How to set up AWS OpenSearch with master user mode and index data in Python


Amazon OpenSearch Service is AWS’s managed service for OpenSearch, an open-source fork of Elasticsearch. For faster development and easy interaction with the service, here is how to configure a domain that uses a master user name and password and allows public access.

To create an OpenSearch Service domain using the console:

  1. Go to https://aws.amazon.com and choose Sign In to the Console.
    Under Analytics, choose Amazon OpenSearch Service.

  2. Choose Create domain.

  3. Provide a name for the domain.

  4. For the domain creation method, choose Standard create.
    To quickly configure a production domain with best practices, you can choose Easy create. For the development and testing purposes of this tutorial, we’ll use Standard create.

  5. For templates, choose Dev/test.

  6. For the deployment option, choose Domain with standby.

  7. For Version, choose the latest version.

  8. For now, ignore the Data nodes, Warm and cold data storage, Dedicated master nodes, Snapshot configuration, and Custom endpoint sections.

  9. For simplicity in this tutorial, use a public access domain. Under Network, choose Public access.

  10. In the fine-grained access control settings, keep the Enable fine-grained access control check box selected.
    Select Create master user and provide a username and password.

  11. For now, ignore the SAML authentication and Amazon Cognito authentication sections.

  12. For Access policy, choose Only use fine-grained access control. In this tutorial, fine-grained access control handles authentication, not the domain access policy.

Ignore the rest of the settings and choose Create. New domains typically take 15–30 minutes to initialize, but can take longer depending on the configuration. After your domain initializes, select it to open its configuration pane. Note the domain endpoint under General information (for example, https://search-my-domain.us-east-1.es.amazonaws.com), which you’ll use in the next step.

If you get an error like the following when opening the dashboard, or later when sending requests from Python:

User: anonymous is not authorized to perform: es:ESHttpGet with an explicit deny in a resource-based policy

Double-check the Access policy step and make sure you chose Only use fine-grained access control.

How to index one record in Python

Replace domain_endpoint, master_user_name, and master_pwd with the endpoint, master user name, and password of your own domain.
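One way to supply these three values without hard-coding the password in the script is to read them from environment variables. This is only a minimal sketch: the variable names OPENSEARCH_ENDPOINT, OPENSEARCH_USER, and OPENSEARCH_PASSWORD are hypothetical names chosen for this example, not something AWS defines.

```python
import os


def load_settings(env=os.environ):
    """Read connection settings from a mapping (defaults to the process environment).

    The variable names below are hypothetical; use whatever fits your setup.
    """
    # Drop a trailing slash so that f'{domain_endpoint}/movies/_doc/1' has no '//'.
    domain_endpoint = env["OPENSEARCH_ENDPOINT"].rstrip("/")
    master_user_name = env["OPENSEARCH_USER"]
    master_pwd = env["OPENSEARCH_PASSWORD"]
    return domain_endpoint, master_user_name, master_pwd
```

With the variables exported in your shell, `domain_endpoint, master_user_name, master_pwd = load_settings()` gives you the names used in the scripts in this post.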

import requests
from requests.auth import HTTPBasicAuth

# Endpoint URL
url = f'{domain_endpoint}/movies/_doc/1'


# Authentication details
auth = HTTPBasicAuth(master_user_name, master_pwd)

# JSON data to be sent in the request
data = {
    "director": "Burton, Tim",
    "genre": ["Comedy", "Sci-Fi"],
    "year": 1996,
    "actor": ["Jack Nicholson", "Pierce Brosnan", "Sarah Jessica Parker"],
    "title": "Mars Attacks!"
}

# Headers
headers = {
    'Content-Type': 'application/json'
}

# Making the PUT request
response = requests.put(url, json=data, auth=auth, headers=headers)

# Printing the response (optional)
print(response.status_code)
print(response.text)
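Instead of reading the raw response text, you may want to check whether the document was created or updated. A successful index request returns a JSON body with fields such as result, _index, _id, and _version. Here is a small sketch of a helper that summarizes it; the function name and the sample body are made up for illustration and are not captured from a real domain.

```python
def summarize_index_response(status_code, body):
    """Summarize the response of a single-document index request.

    status_code: the HTTP status (response.status_code)
    body: the parsed JSON body (response.json())
    """
    if status_code in (200, 201):
        # "result" is "created" for a new ID and "updated" when the ID already existed.
        result = body.get("result", "ok")
        return f"{result} {body.get('_index')}/{body.get('_id')} (version {body.get('_version')})"
    # Error bodies vary, so just report the status and whatever came back.
    return f"failed with HTTP {status_code}: {body}"


# Illustrative body, trimmed to the fields used above.
sample = {"_index": "movies", "_id": "1", "_version": 1, "result": "created"}
print(summarize_index_response(201, sample))
```

In the script above you would call it as `summarize_index_response(response.status_code, response.json())`, assuming the body is valid JSON.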

How to index documents in bulk in Python

import requests
from requests.auth import HTTPBasicAuth

# Endpoint URL
url = f'{domain_endpoint}/_bulk'

# Authentication details
auth = HTTPBasicAuth(master_user_name, master_pwd)

# Headers
headers = {
    'Content-Type': 'application/json'
}

# Read the contents of the JSON file
# (the _bulk API expects newline-delimited JSON that ends with a newline)
with open('bulk_movies.json', 'r') as file:
    data = file.read()

# Making the POST request
response = requests.post(url, data=data, auth=auth, headers=headers)

# Printing the response (optional)
print(response.status_code)
print(response.text)
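The script above assumes that bulk_movies.json already exists in the format the _bulk API expects: one action line followed by one document line for each record, with a newline at the end of the body. If your records live in Python objects instead, a payload like that could be built roughly as follows. The function name and the sample documents are invented for this sketch.

```python
import json


def build_bulk_payload(index_name, docs):
    """Serialize (doc_id, document) pairs into the newline-delimited _bulk format."""
    lines = []
    for doc_id, doc in docs:
        # Action line: index the next line into index_name under doc_id.
        lines.append(json.dumps({"index": {"_index": index_name, "_id": doc_id}}))
        # Source line: the document itself, serialized on a single line.
        lines.append(json.dumps(doc))
    # The bulk body has to end with a newline.
    return "\n".join(lines) + "\n"


# Made-up sample documents, for illustration only.
payload = build_bulk_payload("movies", [
    ("2", {"title": "Example Movie A", "year": 2001}),
    ("3", {"title": "Example Movie B", "year": 2002}),
])
print(payload)
```

The resulting string can be passed as `data=payload` to the same requests.post call, in place of the file contents.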


How to query documents in Python

import requests
from requests.auth import HTTPBasicAuth

# Endpoint URL
url = f'{domain_endpoint}/movies/_search'

# Parameters
params = {
    'q': 'John',
    'pretty': 'true'
}

# Authentication
auth = HTTPBasicAuth(master_user_name, master_pwd)

# Making the GET request
response = requests.get(url, params=params, auth=auth)

# Printing the response (optional)
print(response.status_code)
print(response.text)
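The search response is a JSON document whose matches sit under hits.hits, each with an _id, a _score, and the original document in _source. A minimal sketch for pulling those out is shown here; the helper name and the sample body are illustrative, and the body is trimmed to the fields used.

```python
def extract_hits(search_body):
    """Return (id, score, source) tuples from a parsed _search response body."""
    hits = search_body.get("hits", {}).get("hits", [])
    # _score can be missing or null (for example when sorting), so use .get for it.
    return [(h["_id"], h.get("_score"), h.get("_source", {})) for h in hits]


# Illustrative response body, not captured from a real domain.
sample_body = {
    "hits": {
        "total": {"value": 1, "relation": "eq"},
        "hits": [
            {"_index": "movies", "_id": "1", "_score": 0.5,
             "_source": {"title": "Mars Attacks!"}},
        ],
    }
}

for doc_id, score, source in extract_hits(sample_body):
    print(doc_id, score, source.get("title"))
```

In the query script you would call `extract_hits(response.json())` after checking that the status code is 200.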



Author: robot learner
Reprint policy: Unless otherwise stated, all articles in this blog are licensed under CC BY 4.0. If you reproduce this article, please credit the source: robot learner.