Getting Started with the Tx Platform
Info
The Tx Platform provides self-service access to the products listed in this documentation through the Tx Console. You must create a Tx Platform account in order to use the services on this platform.
If you are a Textkernel customer and are not using the Tx Platform, consult the TK Platform documentation.
First Steps
- This documentation covers technical details of the API only. Before starting your integration, you should read the Product Documentation for the products you plan to integrate with. There we show you the most important information on how to create a quick and successful integration.
- Resume Parser
- Job Parser
- Search & Match
- Bimetric Scoring
- Skills Intelligence
- Then, read the Acceptable Use Policy. Complying with it will require your programming expertise.
- If programming in Java or C#, install the Tx Platform SDK using the links in the section below. Otherwise, review the code samples and endpoint documentation throughout the rest of this page.
SDK
The easiest and fastest way to get started programming your integration to the Version 10 API is by using our SDKs. We currently offer SDKs in C# and Java, with others to follow. This is, by far, the best way to use our software. You can parse a document with as few as 3 lines of code!
Each GitHub link below has installation instructions, documentation, and many code examples to help get you started. The SDKs can also be found directly using your preferred package manager.
Programming Language | GitHub |
---|---|
C# | https://github.com/textkernel/tx-dotnet |
Java | https://github.com/textkernel/tx-java |
Code Samples
Here we have a few code samples to get you up and running quickly:
//this code uses the official SDK (see https://github.com/textkernel/tx-dotnet)
TxClient client = new TxClient("12345678", "abcdefghijklmnopqrstuvwxyz", DataCenter.US);
//A Document is an unparsed File (PDF, Word Doc, etc)
Document doc = new Document("resume.docx");
//when you create a ParseRequest, you can specify many configuration settings
//in the ParseOptions. See https://developer.textkernel.com/tx-platform/v10/resume-parser/api/
ParseRequest request = new ParseRequest(doc, new ParseOptions());
try
{
ParseResumeResponse response = await client.ParseResume(request);
//if we get here, it was 200-OK and all operations succeeded
//now we can use the response from Textkernel to output some of the data from the resume
Console.WriteLine("Name: " + response.EasyAccess().GetCandidateName()?.FormattedName);
Console.WriteLine("Email: " + response.EasyAccess().GetEmailAddresses()?.FirstOrDefault());
Console.WriteLine("Phone: " + response.EasyAccess().GetPhoneNumbers()?.FirstOrDefault());
}
catch (TxException e)
{
//this was an outright failure, always try/catch for TxExceptions when using the TxClient
Console.WriteLine($"Error: {e.TxErrorCode}, Message: {e.Message}");
}
//https://developer.mozilla.org/en-US/docs/Using_files_from_web_applications
//a jQuery object does not expose .files, so get the underlying DOM element first
var file = $("#input")[0].files[0];
var modifiedDate = (new Date(file.lastModified)).toISOString().substring(0, 10);
var reader = new FileReader();
reader.onload = function (event) {
//the Base64 library can be found at http://cloud.textkernel.com/console/assets/downloads/v10/Libs/base64.zip
var base64Text = Base64.encodeArray(event.target.result);
var data = {
"DocumentAsBase64String": base64Text,
"DocumentLastModified": modifiedDate
//other options here (see https://developer.textkernel.com/tx-platform/v10/resume-parser/api/)
};
//use the correct URL for the data center associated with your account, or your own server if you are self-hosted
//NOTE: this is shown for demonstration purposes only, you should never embed your credentials
// in javascript that is going to be distributed to end users. Instead, your javascript should
// call a back-end service which then makes the POST to Textkernel's API
$.ajax({
"url": ".../v10/parser/resume",
"method": "POST",
"crossDomain": true,
"headers": {
"Accept": "application/json",
"Content-Type": "application/json",
"Tx-AccountId": "12345678",
"Tx-ServiceKey": "eumey7feY5zjeWZW397Jks6PBj2NRKSH"
},
"data": JSON.stringify(data)
});
}
// when the file is read it triggers the onload event above.
reader.readAsArrayBuffer(file);
import base64
import requests #this module will need to be installed
import json
import os.path
import datetime
base64str = ''
filePath = 'resume.docx'
#open the file, encode the bytes to base64, then decode that to a UTF-8 string
with open(filePath, 'rb') as f:
    base64str = base64.b64encode(f.read()).decode('UTF-8')
epochSeconds = os.path.getmtime(filePath)
lastModifiedDate = datetime.datetime.fromtimestamp(epochSeconds).strftime("%Y-%m-%d")
#use the correct URL for the data center associated with your account, or your own server if you are self-hosted
url = ".../v10/parser/resume"
payload = {
'DocumentAsBase64String': base64str,
'DocumentLastModified': lastModifiedDate
#other options here (see https://developer.textkernel.com/tx-platform/v10/resume-parser/api/)
}
headers = {
'accept': "application/json",
'content-type': "application/json",
'tx-accountid': "12345678",
'tx-servicekey': "eumey7feY5zjeWZW397Jks6PBj2NRKSH",
}
#make the request, NOTE: the payload must be serialized to a json string
response = requests.request("POST", url, data=json.dumps(payload), headers=headers)
responseJson = json.loads(response.content)
#grab the ResumeData
resumeData = responseJson['Value']['ResumeData']
#access the ResumeData properties with simple JSON syntax:
print(resumeData['ContactInformation']['CandidateName']['FormattedName'])
#for response properties and types, see https://developer.textkernel.com/tx-platform/v10/resume-parser/api/
require 'uri'
require 'net/http'
require 'net/https'
require 'base64'
require 'json'
file_path = 'resume.docx'
file_data = IO.binread(file_path)
modified_date = File.mtime(file_path).to_s[0,10]
# Encode the bytes to base64
# strict_encode64 avoids the line feeds that encode64 inserts every 60 characters
base_64_file = Base64.strict_encode64(file_data)
data = {
"DocumentAsBase64String" => base_64_file,
"DocumentLastModified" => modified_date
#other options here (see https://developer.textkernel.com/tx-platform/v10/resume-parser/api/)
}.to_json
#use the correct URL for the data center associated with your account, or your own server if you are self-hosted
uri = URI.parse(".../v10/parser/resume")
https = Net::HTTP.new(uri.host,uri.port)
https.use_ssl = true
headers = {
'Content-Type' => 'application/json',
'Accept' => 'application/json',
'Tx-AccountId' => '12345678', # use your account id here
'Tx-ServiceKey' => 'eumey7feY5zjeWZW397Jks6PBj2NRKSH', # use your service key here
}
req = Net::HTTP::Post.new(uri.path, headers)
req.body = data
res = https.request(req)
# Parse the response body into an object
respObj = JSON.parse(res.body)
# Access properties such as the GivenName and PlainText
givenName = respObj["Value"]["ResumeData"]["ContactInformation"]["CandidateName"]["GivenName"]
resumeText = respObj["Value"]["ResumeData"]["ResumeMetadata"]["PlainText"]
//this code uses the official SDK (see https://github.com/textkernel/tx-java)
TxClient client = new TxClient("12345678", "abcdefghijklmnopqrstuvwxyz", DataCenter.US);
//A Document is an unparsed File (PDF, Word Doc, etc)
Document doc = new Document("resume.docx");
//when you create a ParseRequest, you can specify many configuration settings
//in the ParseOptions. See https://developer.textkernel.com/tx-platform/v10/resume-parser/api/
ParseRequest request = new ParseRequest(doc, new ParseOptions());
try {
ParseResumeResponse response = client.parseResume(request);
//if we get here, it was 200-OK and all operations succeeded
//now we can use the response from Textkernel to output some of the data from the resume
System.out.println("Name: " + response.Value.ResumeData.ContactInformation.CandidateName.FormattedName);
System.out.println("Experience: " + response.Value.ResumeData.EmploymentHistory.ExperienceSummary.Description);
}
catch (TxException e) {
//this was an outright failure, always try/catch for TxExceptions when using the TxClient
System.out.println("Error: " + e.TxErrorCode + ", Message: " + e.getMessage());
}
//If your PHP installation doesn't have an up-to-date CA root certificate bundle, download the one at the curl website and save it on your server:
//http://curl.haxx.se/docs/caextract.html
//Then set a path to it in your php.ini file, e.g. on Windows:
//curl.cainfo=c:\php\cacert.pem
// open the file
$filepath = "resume.docx";
//open in binary mode so the bytes are not altered on Windows
$handle = fopen($filepath, "rb");
$contents = fread($handle, filesize($filepath));
fclose($handle);
$modifiedDate = date("Y-m-d", filemtime($filepath));
//encode to base64
$base64str = base64_encode($contents);
$data = ["DocumentAsBase64String" => $base64str, "DocumentLastModified" => $modifiedDate];
//other options here (see https://developer.textkernel.com/tx-platform/v10/resume-parser/api/)
//use the correct URL for the data center associated with your account, or your own server if you are self-hosted
$url = ".../v10/parser/resume";
//setup curl to make the REST call, you can use an external library
//such as Guzzle if you prefer: http://guzzle.readthedocs.io
$curl = curl_init();
curl_setopt($curl, CURLOPT_POST, 1);
curl_setopt($curl, CURLOPT_POSTFIELDS, json_encode($data));
curl_setopt($curl, CURLOPT_URL, $url);
curl_setopt($curl, CURLOPT_RETURNTRANSFER, 1);
$headers = [
"accept: application/json",
"content-type: application/json; charset=utf-8",
"tx-accountid: 12345678",
"tx-servicekey: eumey7feY5zjeWZW397Jks6PBj2NRKSH"
];
curl_setopt($curl, CURLOPT_HTTPHEADER, $headers);
$result = curl_exec($curl);
curl_close($curl);
//$result now has the parsed document, you can now use json_decode if you like to use Value.ResumeData
//for response properties and types, see https://developer.textkernel.com/tx-platform/v10/resume-parser/api/
const http = require('https');
const fs = require('fs');
var filePath = "resume.docx";
var buffer = fs.readFileSync(filePath);
var base64Doc = buffer.toString('base64');
var modifiedDate = (new Date(fs.statSync(filePath).mtimeMs)).toISOString().substring(0, 10);
//other options here (see https://developer.textkernel.com/tx-platform/v10/resume-parser/api/)
var postData = JSON.stringify({
'DocumentAsBase64String': base64Doc,
'DocumentLastModified': modifiedDate
});
//use https://api.eu.textkernel.com/tx/v10/parser/resume if your account is in the EU data center or
//use https://api.au.textkernel.com/tx/v10/parser/resume if your account is in the AU data center
var options = {
host: 'api.us.textkernel.com',
protocol: 'https:',
path: '/tx/v10/parser/resume',
method: 'POST',
headers: {
'Tx-AccountId': '12345678',
'Tx-ServiceKey': 'eumey7feY5zjeWZW397Jks6PBj2NRKSH',
'Accept': 'application/json',
'Content-Type': 'application/json',
'Content-Length': Buffer.byteLength(postData)
}
};
var request = http.request(options, function (response) {
console.log(`STATUS: ${response.statusCode}`);
response.setEncoding('utf8');
var responseAsString = '';
response.on('data', (chunk) => {
responseAsString += chunk;
});
response.on('end', () => {
var responseAsJson = JSON.parse(responseAsString);
console.log(responseAsJson.Info);
var resumeData = responseAsJson.Value.ResumeData;
//now you can consume the resumeData
});
});
request.write(postData);
request.end();
Info
All code samples above are provided without warranty and are not necessarily indicative of best practices.
Standard Transaction Cost
Endpoints
Info
Our REST API is also documented using Swagger. Follow the links below for the appropriate data center to access an HTML page where you can make sample requests.
Data Center | URL |
---|---|
US | https://api.us.textkernel.com/tx/v10/ |
EU | https://api.eu.textkernel.com/tx/v10/ |
AU | https://api.au.textkernel.com/tx/v10/ |
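As a minimal sketch, the base URLs in the table can be kept in one lookup so that the data center is chosen in a single place. The `BASE_URLS` dictionary and the `endpoint_url` helper are illustrative names, not part of any SDK; the `parser/resume` path is the one used in the code samples above.

```python
# Base URLs copied from the data center table; the helper itself is hypothetical.
BASE_URLS = {
    "US": "https://api.us.textkernel.com/tx/v10/",
    "EU": "https://api.eu.textkernel.com/tx/v10/",
    "AU": "https://api.au.textkernel.com/tx/v10/",
}


def endpoint_url(data_center: str, path: str) -> str:
    """Join the base URL for a data center with an endpoint path."""
    key = data_center.upper()
    if key not in BASE_URLS:
        raise ValueError(f"Unknown data center: {data_center!r}")
    # Strip a leading slash so the result never contains a double slash.
    return BASE_URLS[key] + path.lstrip("/")


print(endpoint_url("eu", "parser/resume"))
```

Raising on an unknown data center surfaces a configuration mistake early, rather than letting it show up later as an authentication failure.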
Authentication
Our REST API handles authentication via the Tx-AccountId and Tx-ServiceKey headers. These keys were generated during account creation and sent to the contacts listed on the account. If authentication fails, we return a 401 Unauthorized HTTP status code.
The most common causes for unauthorized exceptions are:
- Not including the headers in the request
- Making requests to the wrong data center. If you have questions about which data center your account is set up for, contact support@textkernel.com.
If you receive a 403 Forbidden error, please confirm that you are using https. We have deprecated the use of unsecured http connections in this version.
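The two locally detectable causes described here (missing headers and a non-https URL) can be checked before a request is sent. The following is a rough sketch only; `preflight_problems` is a hypothetical helper, and it cannot tell whether the account belongs to the data center being called.

```python
from urllib.parse import urlparse


def preflight_problems(url: str, headers: dict) -> list:
    """Return likely causes of a 401 or 403 that can be detected locally."""
    problems = []
    # HTTP header names are case-insensitive, so compare in lowercase.
    present = {name.lower() for name, value in headers.items() if value}
    for required in ("Tx-AccountId", "Tx-ServiceKey"):
        if required.lower() not in present:
            problems.append(f"missing header {required} (expect 401)")
    if urlparse(url).scheme != "https":
        problems.append("URL does not use https (expect 403)")
    return problems


# Example: one header missing and an http URL yields two findings.
print(preflight_problems(
    "http://api.us.textkernel.com/tx/v10/parser/resume",
    {"Tx-AccountId": "12345678"},
))
```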
Request Headers
It is unnecessary to include these headers when using the Textkernel SDK. You supply your AccountId and ServiceKey when creating a TxClient.
Header | Data Type | Required | Description |
---|---|---|---|
Tx-AccountId | string | Yes | The Account ID that is provided to you when establishing your Service Account. |
Tx-ServiceKey | string | Yes | The Service key that is provided to you for access to your account’s available credits. |
Content-Type | string | Yes | Indicates the media type of the incoming request body. The only supported value is application/json. |
Tx-TrackingTag | string | No | An optional value that you can use for tracking API usage in the Tx Console. Comma-separated values are accepted here, and the max length is 100 characters. |
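For the optional Tx-TrackingTag header, a small helper can join several tags with commas and enforce the 100-character limit from the table before the request is built. This is a sketch under stated assumptions: `tracking_tag` is a hypothetical function, and the tag values and service key shown are placeholders.

```python
MAX_TRACKING_TAG_LENGTH = 100  # limit stated in the header table


def tracking_tag(*tags: str) -> str:
    """Join tags with commas and enforce the 100-character limit."""
    cleaned = [tag.strip() for tag in tags if tag.strip()]
    value = ",".join(cleaned)
    if len(value) > MAX_TRACKING_TAG_LENGTH:
        raise ValueError(
            f"Tx-TrackingTag is {len(value)} characters; "
            f"the maximum is {MAX_TRACKING_TAG_LENGTH}"
        )
    return value


# Placeholder credentials; replace them with your own values.
headers = {
    "Tx-AccountId": "12345678",
    "Tx-ServiceKey": "your-service-key",
    "Content-Type": "application/json",
}
tag = tracking_tag("team-a", "nightly-import")
if tag:
    # The header is optional, so only add it when there is something to send.
    headers["Tx-TrackingTag"] = tag
```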
Versioning
We continuously deploy bug fixes and new features to our API. We limit breaking changes to new versions deployed to new URLs, unless the change fixes an error in the output*. At the top of our documentation site, you can change the API version of the documentation.
Release notes can be found here.
API Version | Status | Notes |
---|---|---|
Version 10 | Suggested (new customers/projects) | This is the same as v9 under-the-hood, but features an entirely new/modern input/output structure and official SDKs. |
Version 9 | Suggested (existing customers/projects) | This version is still receiving updates and is fully supported. Full release notes can be found here. |
* In order to keep up with industry standards, we will occasionally make changes, including but not limited to security and infrastructure areas. Whenever customer impact is foreseen, we'll do our best to communicate at least 30 days in advance.
HTTP Status Codes
Our API uses conventional HTTP status codes to describe the overall status of the transaction. The specific codes we return are detailed below and mapped to the Info.Code values we return for every transaction:
HTTP Status Code | Info.Code | Description |
---|---|---|
200 - OK | Success, WarningsFoundDuringParsing, PossibleTruncationFromTimeout, SomeErrors | The transaction was successful |
400 - Bad Request | MissingParameter, InvalidParameter, InsufficientData, DataNotFound, CoordinatesNotFound, ConstraintError | Unable to process request |
401 - Unauthorized | AuthenticationError, Unauthorized | The AccountId and/or ServiceKey were invalid |
403 - Forbidden | N/A | The request was made using http instead of https. |
404 - Not Found | DataNotFound | The requested asset wasn't found. |
408 - Request Timeout | Timeout | The transaction reached its timeout limit. |
409 - Conflict | DuplicateAsset | The request could not be completed due to a conflict with the current state of the target resource. |
413 - Payload Too Large | RequestTooLarge | The request is too large to be processed by the server. The max file size is ~16MB on disk. |
422 - Unprocessable Entity | ConversionException | The request made was syntactically correct, but the provided document was unable to be converted to text. |
429 - Too Many Requests | TooManyRequests | Your submission has been rejected without being processed because you were exceeding the allowable batch parsing transaction concurrency limit per the AUP. You have been charged for submitting the transaction. It is your responsibility to resubmit this transaction after you correct the process which caused the concurrency problem. |
500 - Internal Server Error | UnhandledException | An unexpected issue occurred (these are extremely rare). |
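One way to use this table in client code is to report the HTTP status together with the Info.Code from the response body. The sketch below assumes the body is already decoded JSON with an `Info` object that carries a `Code` field, as the table implies; `STATUS_MEANINGS` and `describe_response` are hypothetical names, and the descriptions are condensed from the table.

```python
# Descriptions condensed from the status code table; the helper is hypothetical.
STATUS_MEANINGS = {
    200: "success",
    400: "bad request",
    401: "invalid AccountId and/or ServiceKey",
    403: "request made over http instead of https",
    404: "asset not found",
    408: "transaction timed out",
    409: "conflict with existing asset",
    413: "request too large",
    422: "document could not be converted to text",
    429: "concurrency limit exceeded",
    500: "unexpected server error",
}


def describe_response(status_code: int, body: dict) -> str:
    """Combine the HTTP status with the Info.Code reported in the body."""
    meaning = STATUS_MEANINGS.get(status_code, "unlisted status")
    # The table lists no Info.Code for 403, so fall back to N/A when absent.
    info_code = (body.get("Info") or {}).get("Code", "N/A")
    return f"{status_code} ({meaning}), Info.Code={info_code}"


print(describe_response(200, {"Info": {"Code": "WarningsFoundDuringParsing"}}))
```

A 200 response can still carry WarningsFoundDuringParsing, PossibleTruncationFromTimeout, or SomeErrors, so it is worth inspecting Info.Code even when the status indicates success.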