Search! & Match! API

Query Extraction Service

This service is only available when Match is enabled for your environment.

Method extractFromUrl

Method call

extractFromUrl(environment, password, url, extractorName) : Query

Description

The extractFromUrl method receives a URL as input and returns a query. Query extraction creates a query from a semantically parsed document, for example automatically turning a vacancy into a query that searches for suitable candidates. If the extractorName parameter is provided, that extractor is used; otherwise, the configured default extractor is used.

Parameters

| Parameter Name | Type | Description |
|---|---|---|
| environment | string | Identifier of a search environment. |
| password | string | Password for the search environment. |
| url | string | URL of the document to be downloaded and processed by the Textkernel Parser. |
| extractorName | string | Optional. The name of the extraction service that will be used for query parsing. Used when multiple (different) query templates are set up for the environment. Please contact Textkernel Support for more information. |
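
As a rough illustration of the parameter table above, the following Python sketch assembles the argument set for an extractFromUrl call. The helper name build_extract_from_url_args is hypothetical and not part of the API. The local check for empty mandatory arguments only mirrors the EMPTY_ARGUMENT error on the client side, and how the arguments are actually sent (for example through a SOAP client library) is deliberately left open.

```python
from typing import Dict, Optional


def build_extract_from_url_args(
    environment: str,
    password: str,
    url: str,
    extractor_name: Optional[str] = None,
) -> Dict[str, str]:
    """Assemble the arguments for extractFromUrl (hypothetical helper)."""
    args = {"environment": environment, "password": password, "url": url}
    # Assumption: rejecting empty mandatory arguments locally avoids a round
    # trip that the service would answer with EMPTY_ARGUMENT.
    empty = [name for name, value in args.items() if not value]
    if empty:
        raise ValueError("empty mandatory argument(s): " + ", ".join(empty))
    # extractorName is optional; when it is left out, the service uses the
    # configured default extractor.
    if extractor_name:
        args["extractorName"] = extractor_name
    return args
```

Leaving extractor_name unset keeps the key out of the result entirely, which matches the documented fallback to the default extractor.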

Returns

| Result Name | Type | Description |
|---|---|---|
| query | String | See format of the Query Language. |

Pre-Condition

  • Parsing is set up for your environment.
  • The URL must be accessible to Textkernel.

Post-Condition

None.

Error Handling

| Error Code | Description |
|---|---|
| EMPTY_ARGUMENT | One or more mandatory arguments are empty. |
| INVALID_PASSWORD | The password is incorrect. |
| ENVIRONMENT_NOT_AVAILABLE | The environment is not available (see the log file for possible errors). |
| QUERY_EXTRACTION_NOT_AVAILABLE | The query extraction is not available. |
| URL_NOT_FOUND | The given external URL cannot be downloaded by the Textkernel Parser. |
| QUERY_EXTRACTION_EXECUTION_ERROR | An error occurred while processing the document. |
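
Client code often wants to turn these codes into readable messages. The Python sketch below captures the codes listed for extractFromUrl in an enum and looks up the documented description. The names ExtractionError and describe_error are hypothetical, and how the service actually transports an error code (for example inside a SOAP fault) is not covered here.

```python
from enum import Enum


class ExtractionError(Enum):
    """Error codes listed for extractFromUrl, with their descriptions."""

    EMPTY_ARGUMENT = "One or more mandatory arguments are empty."
    INVALID_PASSWORD = "The password is incorrect."
    ENVIRONMENT_NOT_AVAILABLE = "The environment is not available."
    QUERY_EXTRACTION_NOT_AVAILABLE = "The query extraction is not available."
    URL_NOT_FOUND = (
        "The given external URL cannot be downloaded by the Textkernel Parser."
    )
    QUERY_EXTRACTION_EXECUTION_ERROR = (
        "An error occurred while processing the document."
    )


def describe_error(code: str) -> str:
    """Return the documented description, or a fallback for unknown codes."""
    try:
        return ExtractionError[code].value
    except KeyError:
        return "Unknown error code: " + code
```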

Method extractFromUrlWithToken

Method call

extractFromUrlWithToken(accessToken, url, extractorName) : Query

Description

This method behaves differently depending on the scheme (doc, http, or https) of the url parameter.

  • doc:// The URI is expected to be in the format described in Match Queries. The service retrieves the document from Search's own docstore and sends it to the extractor to generate a query. If the URI contains an extractorName, that extractor is used; otherwise, the method's extractorName parameter is used. If neither is set, the configured default extractor is used.
  • http://, https:// The url parameter is expected to be a valid external URL. The service forwards it to the configured extractor account with the specified name and returns the generated query. If no extractorName is provided, the configured default extractor is used.
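
The two cases above can be pictured with a small Python sketch of the scheme dispatch and the extractor precedence. This is only a model of the documented behaviour, not the service's implementation: plan_extraction is a hypothetical name, and the extractor name embedded in a doc:// URI is passed in as an already parsed value (uri_extractor) because the URI format itself is defined in Match Queries rather than here.

```python
from typing import Optional, Tuple
from urllib.parse import urlsplit


def plan_extraction(
    url: str,
    param_extractor: Optional[str] = None,
    uri_extractor: Optional[str] = None,
) -> Tuple[str, Optional[str]]:
    """Return (document source, extractor name).

    An extractor name of None stands for the configured default extractor.
    """
    scheme = urlsplit(url).scheme.lower()
    if scheme == "doc":
        # Docstore document: an extractor named in the URI takes precedence
        # over the extractorName parameter.
        return "docstore", uri_extractor or param_extractor
    if scheme in ("http", "https"):
        # External URL: only the extractorName parameter is considered.
        return "external", param_extractor
    raise ValueError("unsupported URL scheme: " + repr(scheme))
```

Returning None instead of a concrete default keeps the sketch independent of any environment configuration, which is only known on the service side.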

HTTP Servlet

When the service is called as an HTTP POST servlet instead of through a SOAP service call, use the following:

  • URL: https://home.textkernel.nl/match-SearchBox3/queryExtractionUrl
  • Parameters: accessToken, url, extractorName.
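
A minimal Python sketch of such a POST request, using only the standard library, might look as follows. The servlet URL and parameter names come from the list above; the form-encoded request body is an assumption, since the expected encoding is not stated here, and build_url_extraction_request is a hypothetical helper name. The function only builds the request and does not send it.

```python
from typing import Optional
from urllib.parse import urlencode
from urllib.request import Request

SERVLET_URL = "https://home.textkernel.nl/match-SearchBox3/queryExtractionUrl"


def build_url_extraction_request(
    access_token: str, url: str, extractor_name: Optional[str] = None
) -> Request:
    """Build (but do not send) a POST request for the URL servlet."""
    fields = {"accessToken": access_token, "url": url}
    # extractorName is optional; omit it to use the default extractor.
    if extractor_name:
        fields["extractorName"] = extractor_name
    # Assumption: the servlet accepts form-encoded POST parameters.
    body = urlencode(fields).encode("utf-8")
    return Request(SERVLET_URL, data=body, method="POST")
```

The request could then be sent with urllib.request.urlopen; the format of the response body is not described in this section, so parsing it is left out.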

Method extractFromFile

Method call

extractFromFile(environment, password, filename, fileContent, extractorName) : Query

Description

The extractFromFile method receives a file as input and returns a query. The Textkernel Parser must be enabled for the environment so that it can process the file and return a templated result as a query. If extractorName is provided, that extractor is used. If it is null or does not match any extractor defined in the environment configuration, the default extractor is used.
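
The fallback rule for extractorName can be summarised in a few lines of Python. This is a sketch of the documented behaviour only: select_extractor is a hypothetical name, and the set of configured extractors and the default are stand-ins for the environment configuration, which lives on the service side.

```python
from typing import Collection, Optional


def select_extractor(
    requested: Optional[str], configured: Collection[str], default: str
) -> str:
    """Pick the extractor for a request, falling back to the default."""
    # A missing name, or one that matches no configured extractor,
    # results in the default extractor being used.
    if requested is not None and requested in configured:
        return requested
    return default
```

One consequence worth noting is that a misspelled extractorName does not produce an error under this rule; the request is simply handled by the default extractor.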

Parameters

| Parameter Name | Type | Description |
|---|---|---|
| environment | string | Identifier of a search environment. |
| password | string | Password for the search environment. |
| filename | string | Filename of the document to process. |
| fileContent | byte array | Binary file content of the document to process. |
| extractorName | string | Optional. The name of the extraction service that will be used for query parsing. Used when multiple (different) query templates are set up for your environment. Please contact Textkernel Support for more information. |

Returns

| Result Name | Type | Description |
|---|---|---|
| query | String | See format of the Query Language. |

Pre-Condition

  • Document Parsing is set up for your environment.
  • The file must be in a format that the Textkernel Parser supports (e.g. docx, PDF, HTML).

Post-Condition

None.

Error Handling

| Error Code | Description |
|---|---|
| EMPTY_ARGUMENT | One or more mandatory arguments are empty. |
| INVALID_PASSWORD | The password is incorrect. |
| ENVIRONMENT_NOT_AVAILABLE | The environment is not available (see the log file for possible errors). |
| QUERY_EXTRACTION_NOT_AVAILABLE | The query extraction is not available. |
| QUERY_EXTRACTION_EXECUTION_ERROR | An error occurred while processing the document. |

Method extractFromFileWithToken

Method call

extractFromFileWithToken(accessToken, filename, fileContent, extractorName) : Query

Description

This method is identical to the extractFromFile method described above, but it requires a valid accessToken from the authentication service instead of the environment and password parameters.

HTTP Servlet

When the service is called as an HTTP POST servlet instead of through a SOAP service call, use the following:

  • URL: https://home.textkernel.nl/match-SearchBox3/queryExtractionFile
  • Parameters: accessToken, file, extractorName.
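
For the file servlet, a Python sketch using only the standard library could build the upload request by hand. The servlet URL and the parameter names accessToken, file, and extractorName come from the list above. Sending the document as multipart/form-data with an application/octet-stream part is an assumption, as the expected encoding is not stated here, and build_file_extraction_request is a hypothetical helper name. The function builds the request without sending it.

```python
import uuid
from typing import Optional
from urllib.request import Request

SERVLET_URL = "https://home.textkernel.nl/match-SearchBox3/queryExtractionFile"


def build_file_extraction_request(
    access_token: str,
    filename: str,
    file_content: bytes,
    extractor_name: Optional[str] = None,
) -> Request:
    """Build (but do not send) a multipart POST request for the file servlet."""
    boundary = uuid.uuid4().hex
    text_fields = {"accessToken": access_token}
    # extractorName is optional; omit it to use the default extractor.
    if extractor_name:
        text_fields["extractorName"] = extractor_name

    parts = []
    for name, value in text_fields.items():
        parts.append(
            (
                f"--{boundary}\r\n"
                f'Content-Disposition: form-data; name="{name}"\r\n\r\n'
                f"{value}\r\n"
            ).encode("utf-8")
        )
    # Assumption: the document travels as a binary part named "file".
    parts.append(
        (
            f"--{boundary}\r\n"
            f'Content-Disposition: form-data; name="file"; filename="{filename}"\r\n'
            "Content-Type: application/octet-stream\r\n\r\n"
        ).encode("utf-8")
        + file_content
        + b"\r\n"
    )
    parts.append(f"--{boundary}--\r\n".encode("utf-8"))

    headers = {"Content-Type": f"multipart/form-data; boundary={boundary}"}
    return Request(SERVLET_URL, data=b"".join(parts), headers=headers, method="POST")
```

A caller would typically read the document with open(path, "rb").read() and pass the resulting bytes as file_content, then send the request with urllib.request.urlopen.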