Parsing API Quick Start Guide

Overview

The parsing API processes CVs/resumes and job description documents and returns the extracted data as structured JSON or XML. It supports both HTTP POST and SOAP.

If you need to parse both CVs/resumes and job documents, you will receive separate credentials for each document type.

This guide serves as a quick start introduction to using the parsing API. For a comprehensive understanding, please consult the Sourcebox API Reference for the complete API documentation.

Postman collection

Download the Textkernel Parsing API Postman collection to get started with the HTTP POST and SOAP requests.

Using HTTP POST

To make use of the parsing service, you'll need to send an HTTP POST request as a multipart stream with Content-Type set to multipart/form-data.

The service can be accessed using the following endpoint:

https://{data-center-url}/match/extract.do?useJsonErrorMsg=true

Required parameters:

Parameter	Description	Content type
account	Your account name	string
username	Your username	string
password	Your password	string
uploaded_file	The document to process	binary data

Here's an example curl command:

curl "https://{data-center-url}/match/extract.do?useJsonErrorMsg=true" \
    --form account=YOUR_ACCOUNT_NAME \
    --form username=YOUR_USERNAME \
    --form password=YOUR_PASSWORD \
    --form uploaded_file=@/path/to/your/file
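The same request can be sketched in Python using only the standard library. This is a minimal illustration, not an official client: the function names build_multipart and build_parse_request are hypothetical, and the host passed in stands for your own data center URL. Only the endpoint path and the four form field names come from the table above.

```python
import urllib.request
import uuid


def build_multipart(fields, file_field, filename, file_bytes):
    """Encode text fields plus one file as a multipart/form-data body.

    Returns (body_bytes, content_type_header_value).
    """
    boundary = uuid.uuid4().hex
    parts = []
    for name, value in fields.items():
        parts.append(
            (
                f"--{boundary}\r\n"
                f'Content-Disposition: form-data; name="{name}"\r\n\r\n'
                f"{value}\r\n"
            ).encode("utf-8")
        )
    # The file part carries the raw document bytes, unmodified.
    parts.append(
        (
            f"--{boundary}\r\n"
            f'Content-Disposition: form-data; name="{file_field}"; '
            f'filename="{filename}"\r\n'
            "Content-Type: application/octet-stream\r\n\r\n"
        ).encode("utf-8")
        + file_bytes
        + b"\r\n"
    )
    parts.append(f"--{boundary}--\r\n".encode("utf-8"))
    return b"".join(parts), f"multipart/form-data; boundary={boundary}"


def build_parse_request(data_center_url, account, username, password,
                        filename, file_bytes):
    """Build (but do not send) the HTTP POST request for the parsing service."""
    url = f"https://{data_center_url}/match/extract.do?useJsonErrorMsg=true"
    body, content_type = build_multipart(
        {"account": account, "username": username, "password": password},
        "uploaded_file",
        filename,
        file_bytes,
    )
    return urllib.request.Request(
        url, data=body, headers={"Content-Type": content_type}, method="POST"
    )
```

To send the request, read the document with pathlib.Path(path).read_bytes(), pass the bytes to build_parse_request, and hand the result to urllib.request.urlopen; the response body can then be read and decoded. Building and sending are kept separate here so the request can be inspected before any credentials leave the machine.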

Using SOAP

You can invoke the method processDocument of the WSDL mentioned below to call the parsing service through SOAP.

https://{data-center-url}/match/soap/extract?wsdl

Parameters for the method processDocument (required unless marked optional):

Parameter	Description	Content type
account	Your account name	string
username	Your username	string
password	Your password	string
filename	File name to process (optional)	string
fileContent	Base64-encoded file content	binary data

Here's an example SOAP call (usable in tools like SOAPUI):

<soapenv:Envelope xmlns:soapenv="http://schemas.xmlsoap.org/soap/envelope/"
   xmlns:ext="http://home.textkernel.nl/sourcebox/soap/extract">
   <soapenv:Header/>
   <soapenv:Body>
      <ext:extract>
         <account>[ACCOUNT NAME]</account>
         <username>[USERNAME]</username>
         <password>[PASSWORD]</password>
         <filename>[OPTIONAL FILENAME]</filename>
         <fileContent>[BASE64 ENCODED FILE CONTENT]</fileContent>
      </ext:extract>
   </soapenv:Body>
</soapenv:Envelope>
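The envelope above can be filled in programmatically. The following Python sketch is an illustration only: build_soap_envelope is a hypothetical helper name, and the element names are copied from the example envelope rather than from the WSDL. It Base64-encodes the file bytes for fileContent and escapes the text values so that characters such as & or < do not break the XML.

```python
import base64
from xml.sax.saxutils import escape

# Template mirroring the example envelope; placeholders are filled by str.format.
SOAP_TEMPLATE = """<soapenv:Envelope xmlns:soapenv="http://schemas.xmlsoap.org/soap/envelope/"
   xmlns:ext="http://home.textkernel.nl/sourcebox/soap/extract">
   <soapenv:Header/>
   <soapenv:Body>
      <ext:extract>
         <account>{account}</account>
         <username>{username}</username>
         <password>{password}</password>
         <filename>{filename}</filename>
         <fileContent>{file_content}</fileContent>
      </ext:extract>
   </soapenv:Body>
</soapenv:Envelope>"""


def build_soap_envelope(account, username, password, file_bytes, filename=""):
    """Return the SOAP envelope as a string, with the file Base64-encoded."""
    return SOAP_TEMPLATE.format(
        account=escape(account),
        username=escape(username),
        password=escape(password),
        filename=escape(filename),
        # Base64 output uses only ASCII characters, so no XML escaping is needed.
        file_content=base64.b64encode(file_bytes).decode("ascii"),
    )
```

The resulting string can be pasted into a tool such as SOAPUI or posted by a SOAP client. The service address and any SOAPAction header should be taken from the WSDL rather than assumed.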

Datacenter URLs

The data center URL depends on your region. The email or documentation you received when your account was created should include the data center URL to use. It should match one of the options listed on the Infrastructure page.

If you did not receive the data center URL in the initial email or documentation, contact Textkernel Support, who can provide the correct URL for your region.
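Because both endpoints in this guide share the same {data-center-url} placeholder, it can help to derive them from a single configured value. The sketch below assumes a hypothetical helper named parsing_endpoints and a placeholder host; substitute the data center URL assigned to your account.

```python
def parsing_endpoints(data_center_url):
    """Return the HTTP POST endpoint and the SOAP WSDL URL for one data center.

    Accepts the host with or without a leading "https://" or trailing slash.
    """
    host = data_center_url.strip().removeprefix("https://").rstrip("/")
    if not host:
        raise ValueError("data center URL must not be empty")
    return {
        "http_post": f"https://{host}/match/extract.do?useJsonErrorMsg=true",
        "soap_wsdl": f"https://{host}/match/soap/extract?wsdl",
    }
```

Keeping the host in one place means that a change of region only needs a single configuration update.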