
Getting Started with the Tx Platform

Info

The Tx Platform provides self-service access to the products listed in this documentation through the Tx Console. You must create a Tx Platform account in order to use the services on this platform.

If you are a Textkernel customer and are not using the Tx Platform, consult the TK Platform documentation.

First Steps

  1. This documentation is for technical details about the API only. Before starting your integration, you should read the Product Documentation for the products you plan to integrate with. We will show you the most important information on how to create a quick and successful integration.

SDK

The easiest and fastest way to start programming your integration to the Version 10 API is by using our SDKs. We currently offer SDKs in C# and Java, with others to follow. This is, by far, the best way to use our software. You can parse a document with as few as 3 lines of code!

Each GitHub link below has installation instructions, documentation, and many code examples to help get you started. The SDKs can also be found directly using your preferred package manager.

Programming Language GitHub
C# https://github.com/textkernel/tx-dotnet
Java https://github.com/textkernel/tx-java

Code Samples

Here we have a few code samples to get you up and running quickly:

C#

//this code uses the official SDK (see https://github.com/textkernel/tx-dotnet)
TxClient client = new TxClient("12345678", "abcdefghijklmnopqrstuvwxyz", DataCenter.US);

//A Document is an unparsed File (PDF, Word Doc, etc)
Document doc = new Document("resume.docx");

//when you create a ParseRequest, you can specify many configuration settings
//in the ParseOptions. See https://developer.textkernel.com/tx-platform/v10/resume-parser/api/
ParseRequest request = new ParseRequest(doc, new ParseOptions());

try
{
    ParseResumeResponse response = await client.ParseResume(request);
    //if we get here, it was 200-OK and all operations succeeded

    //now we can use the response from Textkernel to output some of the data from the resume
    Console.WriteLine("Name: " + response.EasyAccess().GetCandidateName()?.FormattedName);
    Console.WriteLine("Email: " + response.EasyAccess().GetEmailAddresses()?.FirstOrDefault());
    Console.WriteLine("Phone: " + response.EasyAccess().GetPhoneNumbers()?.FirstOrDefault());
}
catch (TxException e)
{
    //this was an outright failure, always try/catch for TxExceptions when using the TxClient
    Console.WriteLine($"Error: {e.TxErrorCode}, Message: {e.Message}");
}

JavaScript

//https://developer.mozilla.org/en-US/docs/Using_files_from_web_applications

//a jQuery object has no .files property, so unwrap the underlying DOM element first
var file = $("#input")[0].files[0];
var modifiedDate = (new Date(file.lastModified)).toISOString().substring(0, 10);
var reader = new FileReader();

reader.onload = function (event) {
  //the Base64 library can be found at http://cloud.textkernel.com/console/assets/downloads/v10/Libs/base64.zip
  var base64Text = Base64.encodeArray(event.target.result);

  var data = {
      "DocumentAsBase64String": base64Text,
      "DocumentLastModified": modifiedDate
      //other options here (see https://developer.textkernel.com/tx-platform/v10/resume-parser/api/)
  };

  //use the correct URL for the data center associated with your account, or your own server if you are self-hosted

  //NOTE: this is shown for demonstration purposes only, you should never embed your credentials
  // in javascript that is going to be distributed to end users. Instead, your javascript should
  // call a back-end service which then makes the POST to Textkernel's API
  $.ajax({
      "url": ".../v10/parser/resume",
      "method": "POST",
      "crossDomain": true,
      "headers": {
          "Accept": "application/json",
          "Content-Type": "application/json",
          "Tx-AccountId": "12345678",
          "Tx-ServiceKey": "eumey7feY5zjeWZW397Jks6PBj2NRKSH"
      },
      "data": JSON.stringify(data)
  });
}

// when the file is read it triggers the onload event above.
reader.readAsArrayBuffer(file);

Python

import base64
import requests #this module will need to be installed
import json
import os.path
import datetime

base64str = ''
filePath = 'resume.docx'

#open the file, encode the bytes to base64, then decode that to a UTF-8 string
with open(filePath, 'rb') as f:
    base64str = base64.b64encode(f.read()).decode('UTF-8')

epochSeconds = os.path.getmtime(filePath)
lastModifiedDate = datetime.datetime.fromtimestamp(epochSeconds).strftime("%Y-%m-%d") 

#use the correct URL for the data center associated with your account, or your own server if you are self-hosted
url = ".../v10/parser/resume"
payload = {
  'DocumentAsBase64String': base64str,
  'DocumentLastModified': lastModifiedDate
  #other options here (see https://developer.textkernel.com/tx-platform/v10/resume-parser/api/)
}

headers = {
  'accept': "application/json",
  'content-type': "application/json",
  'tx-accountid': "12345678",
  'tx-servicekey': "eumey7feY5zjeWZW397Jks6PBj2NRKSH",
}

#make the request, NOTE: the payload must be serialized to a json string
response = requests.request("POST", url, data=json.dumps(payload), headers=headers)
responseJson = json.loads(response.content)

#grab the ResumeData
resumeData = responseJson['Value']['ResumeData']

#access the ResumeData properties with simple JSON syntax:
print(resumeData['ContactInformation']['CandidateName']['FormattedName'])
#for response properties and types, see https://developer.textkernel.com/tx-platform/v10/resume-parser/api/

Ruby

require 'uri'
require 'net/http'
require 'net/https'
require 'base64'
require 'json'

file_path = 'resume.docx'
file_data = IO.binread(file_path)
modified_date = File.mtime(file_path).to_s[0,10]

# Encode the bytes to base64 (strict_encode64 emits no line breaks, unlike encode64)
base_64_file = Base64.strict_encode64(file_data)
data = {
  "DocumentAsBase64String" => base_64_file,
  "DocumentLastModified" => modified_date
  #other options here (see https://developer.textkernel.com/tx-platform/v10/resume-parser/api/)
}.to_json

#use the correct URL for the data center associated with your account, or your own server if you are self-hosted
uri = URI.parse(".../v10/parser/resume")
https = Net::HTTP.new(uri.host,uri.port)
https.use_ssl = true

headers = {
  'Content-Type' => 'application/json',
  'Accept' => 'application/json',
  'Tx-AccountId' => '12345678', # use your account id here
  'Tx-ServiceKey' =>  'eumey7feY5zjeWZW397Jks6PBj2NRKSH', # use your service key here
}

req = Net::HTTP::Post.new(uri.path, headers)
req.body = data

res = https.request(req)

# Parse the response body into an object
respObj = JSON.parse(res.body)

# Access properties such as the GivenName and PlainText
givenName = respObj["Value"]["ResumeData"]["ContactInformation"]["CandidateName"]["GivenName"]
resumeText = respObj["Value"]["ResumeData"]["ResumeMetadata"]["PlainText"]

Java

//this code uses the official SDK (see https://github.com/textkernel/tx-java)
TxClient client = new TxClient("12345678", "abcdefghijklmnopqrstuvwxyz", DataCenter.US);

//A Document is an unparsed File (PDF, Word Doc, etc)
Document doc = new Document("resume.docx");

//when you create a ParseRequest, you can specify many configuration settings
//in the ParseOptions. See https://developer.textkernel.com/tx-platform/v10/resume-parser/api/
ParseRequest request = new ParseRequest(doc, new ParseOptions());

try {
  ParseResumeResponse response = client.parseResume(request);
  //if we get here, it was 200-OK and all operations succeeded

  //now we can use the response from Textkernel to output some of the data from the resume
  System.out.println("Name: " + response.Value.ResumeData.ContactInformation.CandidateName.FormattedName);
  System.out.println("Experience: " + response.Value.ResumeData.EmploymentHistory.ExperienceSummary.Description);
}
catch (TxException e) {
  //this was an outright failure, always try/catch for TxExceptions when using the TxClient
  System.out.println("Error: " + e.TxErrorCode + ", Message: " + e.getMessage());
}

PHP

//If your PHP installation doesn't have an up-to-date CA root certificate bundle, download the one at the curl website and save it on your server:
//http://curl.haxx.se/docs/caextract.html
//Then set a path to it in your php.ini file, e.g. on Windows:
//curl.cainfo=c:\php\cacert.pem

// open the file
$filepath = "resume.docx";
$handle = fopen($filepath, "rb"); //binary mode, so the bytes are read unmodified on Windows
$contents = fread($handle, filesize($filepath));
fclose($handle);

$modifiedDate = date("Y-m-d", filemtime($filepath));

//encode to base64
$base64str = base64_encode($contents);
$data = ["DocumentAsBase64String" => $base64str, "DocumentLastModified" => $modifiedDate];
//other options here (see https://developer.textkernel.com/tx-platform/v10/resume-parser/api/)

//use the correct URL for the data center associated with your account, or your own server if you are self-hosted
$url = ".../v10/parser/resume";

//setup curl to make the REST call, you can use an external library
//such as Guzzle if you prefer: http://guzzle.readthedocs.io
$curl = curl_init();
curl_setopt($curl, CURLOPT_POST, 1);
curl_setopt($curl, CURLOPT_POSTFIELDS, json_encode($data));

curl_setopt($curl, CURLOPT_URL, $url);
curl_setopt($curl, CURLOPT_RETURNTRANSFER, 1);

$headers = [
  "accept: application/json",
  "content-type: application/json; charset=utf-8",
  "tx-accountid: 12345678",
  "tx-servicekey: eumey7feY5zjeWZW397Jks6PBj2NRKSH"
];
curl_setopt($curl, CURLOPT_HTTPHEADER, $headers);

$result = curl_exec($curl);
curl_close($curl);

//$result now holds the parsed document as a JSON string; use json_decode to access Value.ResumeData
//for response properties and types, see https://developer.textkernel.com/tx-platform/v10/resume-parser/api/

Node.js

const http = require('https');
const fs = require('fs');

var filePath = "resume.docx";
var buffer = fs.readFileSync(filePath);
var base64Doc = buffer.toString('base64');

var modifiedDate = (new Date(fs.statSync(filePath).mtimeMs)).toISOString().substring(0, 10);

//other options here (see https://developer.textkernel.com/tx-platform/v10/resume-parser/api/)
var postData = JSON.stringify({
  'DocumentAsBase64String': base64Doc,
  'DocumentLastModified': modifiedDate
});

//use https://api.eu.textkernel.com/tx/v10/parser/resume if your account is in the EU data center or
//use https://api.au.textkernel.com/tx/v10/parser/resume if your account is in the AU data center
var options = {
  host: 'api.us.textkernel.com',
  protocol: 'https:',
  path: '/tx/v10/parser/resume',
  method: 'POST',
  headers: {
      'Tx-AccountId': '12345678',
      'Tx-ServiceKey': 'eumey7feY5zjeWZW397Jks6PBj2NRKSH',
      'Accept': 'application/json',
      'Content-Type': 'application/json',
      'Content-Length': Buffer.byteLength(postData)
  }
};

var request = http.request(options, function (response) {
  console.log(`STATUS: ${response.statusCode}`);
  response.setEncoding('utf8');

  var responseAsString = '';

  response.on('data', (chunk) => {
    responseAsString += chunk;
  });

  response.on('end', () => {
    var responseAsJson = JSON.parse(responseAsString);
    console.log(responseAsJson.Info);
    var resumeData = responseAsJson.Value.ResumeData;        
    //now you can consume the resumeData
  });
});

request.write(postData);
request.end();

Info

All above code samples are provided without warranty and are not necessarily indicative of best practices.

Standard Transaction Cost

This page has moved

Endpoints

Info

Our REST API is also documented using Swagger. Follow the links below for the appropriate data center to access an HTML page where you can make sample requests.

Data Center
US https://api.us.textkernel.com/tx/v10/
EU https://api.eu.textkernel.com/tx/v10/
AU https://api.au.textkernel.com/tx/v10/
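As a minimal sketch, the base URLs in the table above can be kept in one lookup so that the data center is chosen in a single place. The helper name `endpoint_url` is hypothetical and not part of any Textkernel SDK; the `parser/resume` path is the one used in the code samples earlier on this page.

```python
# Base URLs copied from the data center table above.
BASE_URLS = {
    "US": "https://api.us.textkernel.com/tx/v10/",
    "EU": "https://api.eu.textkernel.com/tx/v10/",
    "AU": "https://api.au.textkernel.com/tx/v10/",
}


def endpoint_url(data_center: str, path: str) -> str:
    """Join a data center's base URL with an endpoint path (hypothetical helper)."""
    key = data_center.strip().upper()
    if key not in BASE_URLS:
        raise ValueError(f"Unknown data center: {data_center!r}")
    # Drop any leading slash so the result never contains a double slash.
    return BASE_URLS[key] + path.lstrip("/")


if __name__ == "__main__":
    print(endpoint_url("EU", "parser/resume"))
```

Keeping the mapping in one place makes it harder to send requests to a data center other than the one your account belongs to.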

Authentication

Our REST API handles authentication via the Tx-AccountId and Tx-ServiceKey headers. These keys were generated during account creation and sent to the contacts listed on the account. If authentication fails, we return a 401 Unauthorized HTTP status code.

The most common causes for unauthorized exceptions are:

  • Not including the headers in the request
  • Making requests to the wrong data center. If you have questions about which data center your account is set up for, contact support@textkernel.com.

If you receive a 403 Forbidden Access exception, please confirm that you are using https. We have deprecated the use of unsecured http connections in this version.
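The troubleshooting steps above can be sketched as a small client-side check. This is an illustration only: `diagnose_auth_failure` is a hypothetical helper, and its messages simply restate the causes listed in this section rather than anything returned by the API.

```python
from urllib.parse import urlparse

REQUIRED_AUTH_HEADERS = ("Tx-AccountId", "Tx-ServiceKey")


def diagnose_auth_failure(status_code: int, url: str, headers: dict) -> str:
    """Return a troubleshooting hint for a 401 or 403 response (hypothetical helper)."""
    # HTTP header names are case-insensitive, so compare in lower case.
    sent = {name.lower() for name in headers}
    if status_code == 403:
        if urlparse(url).scheme != "https":
            return "403: the request used http; switch the URL to https."
        return "403: confirm the request is being sent over https."
    if status_code == 401:
        missing = [h for h in REQUIRED_AUTH_HEADERS if h.lower() not in sent]
        if missing:
            return "401: missing header(s): " + ", ".join(missing)
        return "401: check the credentials and that the URL matches your account's data center."
    return f"{status_code}: not an authentication failure."
```

For example, a 401 response to a request that sent only `Tx-AccountId` would be reported as a missing `Tx-ServiceKey` header.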

Request Headers

It is unnecessary to include these headers when using the Textkernel SDK. Your AccountId and ServiceKey are supplied when you create a TxClient.

Header | Data Type | Required | Description
Tx-AccountId | string | Yes | The Account ID that is provided to you when establishing your Service Account.
Tx-ServiceKey | string | Yes | The Service key that is provided to you for access to your account’s available credits.
Content-Type | string | Yes | Indicates the media type of the incoming request body. The only supported value is application/json.
Tx-TrackingTag | string | No | An optional value that you can use for tracking API usage in the Tx Console. Comma-separated values are accepted here, and the max length is 100 characters.
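For callers that are not using an SDK, the header table above might translate into something like the following sketch. The function name `build_headers` is hypothetical, and rejecting an over-long tracking tag on the client side is an assumption made here; this page does not say how the API itself treats tags longer than 100 characters.

```python
from typing import Dict, Optional

MAX_TRACKING_TAG_LENGTH = 100  # limit stated in the header table


def build_headers(account_id: str, service_key: str,
                  tracking_tag: Optional[str] = None) -> Dict[str, str]:
    """Assemble the request headers described in the table (hypothetical helper)."""
    if not account_id or not service_key:
        raise ValueError("Tx-AccountId and Tx-ServiceKey are both required")
    headers = {
        "Tx-AccountId": account_id,
        "Tx-ServiceKey": service_key,
        "Content-Type": "application/json",  # the only supported value
    }
    if tracking_tag is not None:
        # Assumption: fail early on the client instead of sending an over-long tag.
        if len(tracking_tag) > MAX_TRACKING_TAG_LENGTH:
            raise ValueError("Tx-TrackingTag is limited to 100 characters")
        headers["Tx-TrackingTag"] = tracking_tag
    return headers


if __name__ == "__main__":
    # Placeholder credentials; comma-separated tags are accepted per the table.
    print(build_headers("12345678", "your-service-key", "team-a,batch-7"))
```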

Versioning

We continuously deploy bug fixes and new features to our API. We limit breaking changes to new versions deployed to new URLs, unless the change fixes an error in the output. At the top of our documentation site, you can change the API version of the documentation.

Release notes can be found here.

API Version | Status | Notes
Version 10 | Suggested (new customers/projects) | This is the same as v9 under-the-hood, but features an entirely new/modern input/output structure and official SDKs.
Version 9 | Suggested (existing customers/projects) | This version is still receiving updates and is fully supported. Full release notes can be found here.

HTTP Status Codes

Our API uses conventional HTTP status codes to describe the overall status of the transaction. The specific codes we return are detailed below and mapped to the Info.Code values we return for every transaction:

HTTP Status Code | Info.Code | Description
200 - OK | Success, WarningsFoundDuringParsing, PossibleTruncationFromTimeout, SomeErrors | The transaction was successful
400 - Bad Request | MissingParameter, InvalidParameter, InsufficientData, DataNotFound, CoordinatesNotFound, ConstraintError | Unable to process request
401 - Unauthorized | AuthenticationError, Unauthorized | The AccountId and/or ServiceKey were invalid
403 - Forbidden | N/A | The request was made using http instead of https.
404 - Not Found | DataNotFound | The requested asset wasn't found.
408 - Request Timeout | Timeout | The transaction reached its timeout limit.
409 - Conflict | DuplicateAsset | The request could not be completed due to a conflict with the current state of the target resource.
413 - Payload Too Large | RequestTooLarge | The request is too large to be processed by the server. The max file size is ~16MB on disk.
422 - Unprocessable Entity | ConversionException | The request made was syntactically correct, but the provided document was unable to be converted to text.
429 - Too Many Requests | TooManyRequests | Your submission has been rejected without being processed because you were exceeding the allowable batch parsing transaction concurrency limit per the AUP. You have been charged for submitting the transaction. It is your responsibility to resubmit this transaction after you correct the process which caused the concurrency problem.
500 - Internal Server Error | UnhandledException | An unexpected issue occurred (these are extremely rare).
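One way a client might act on the table above is to sort each response into a coarse bucket before deciding what to do next. The sketch below is illustrative: the status codes and Info.Code values come from the table, but the bucket names and the grouping are assumptions of this example, not part of the API.

```python
from typing import Optional

# Status codes from the table above that point at a problem with the request itself.
REQUEST_PROBLEM_STATUSES = {400, 404, 409, 413, 422}


def classify(status_code: int, info_code: Optional[str]) -> str:
    """Group a response into an action bucket (bucket names are this sketch's own)."""
    if status_code == 200:
        # 200 also covers WarningsFoundDuringParsing, PossibleTruncationFromTimeout
        # and SomeErrors, so anything other than Success deserves a closer look.
        return "ok" if info_code == "Success" else "ok_check_info"
    if status_code in (401, 403):
        return "fix_credentials_or_url"
    if status_code == 429:
        # Per the table: correct the concurrency problem, then resubmit.
        return "reduce_concurrency_then_resubmit"
    if status_code in (408, 500):
        return "timeout_or_server_error"
    if status_code in REQUEST_PROBLEM_STATUSES:
        return "fix_request"
    return "unknown"


if __name__ == "__main__":
    print(classify(200, "WarningsFoundDuringParsing"))
    print(classify(413, "RequestTooLarge"))
```

Checking Info.Code even on a 200 response matters because several non-Success codes share that status.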