Parse a Resume

HTTP Verb    Path
POST         /v9/parser/resume

Parse a single Resume/CV.

Info

  • You can try this endpoint out at our Swagger page ( US Data Center | EU Data Center | AU Data Center )
  • By default, our resume parser ensures an optimal balance between parsing accuracy and speed. If you have specific preferences, please contact sales@textkernel.com
  • This service is designed to parse resumes/CVs. It assumes that all files passed to it are resumes/CVs. It does not attempt to detect whether a document is a resume/CV or not. It should not be used to try to extract information from other types of documents.
  • Always send the original file, not the result of copy/paste, not a conversion by some other software, not a scanned image, and not a version marked up with recruiter notes or other non-resume information. Be aware that if you pass garbage into the service, then you are likely to get garbage out. The best results are always obtained by parsing the original resume/CV file.
  • In order to provide parsing for a wide range of languages, the parser does not provide low-usage fields for some languages.
  • If you are running batch transactions (e.g. iterating through files in a folder), do not reparse a file after the service returns an exception for it. You will get the same result each time, and credits will be deducted from your account.
  • Batch transactions must adhere to our Acceptable Use Policy.
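The batch advice above can be sketched as a small ledger that records every file already submitted, so a failed file is never sent again. This is only an illustration: `parse_batch`, `fake_parse` and the file names are hypothetical, and `parse_one` stands in for whatever function actually calls the service.

```python
from typing import Callable, Dict, Iterable


def parse_batch(
    names: Iterable[str],
    parse_one: Callable[[str], dict],
    ledger: Dict[str, str],
) -> Dict[str, str]:
    """Submit each file at most once, recording the outcome in `ledger`."""
    for name in names:
        if name in ledger:
            continue  # already attempted (success or failure): do not resubmit
        try:
            parse_one(name)
            ledger[name] = "ok"
        except Exception as exc:  # the same input would fail the same way again
            ledger[name] = f"failed: {exc}"
    return ledger


# Stand-in for a real service call (hypothetical); records which files were sent.
calls = []


def fake_parse(name: str) -> dict:
    calls.append(name)
    if name == "bad.pdf":
        raise ValueError("ConversionException")
    return {}


ledger: Dict[str, str] = {}
parse_batch(["a.pdf", "bad.pdf"], fake_parse, ledger)
# A second run over the same folder only submits the new file.
parse_batch(["a.pdf", "bad.pdf", "c.pdf"], fake_parse, ledger)
print(calls)
```

In a real batch job the ledger would need to be persisted between runs (for example in a file or database) for the check to be effective.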

Scope of use

FlexRequests (an optional feature) are powered by Large Language Models (LLMs). LLMs can generate information that is factually incorrect or misleading. They may also output plausible-sounding but false information. Therefore, FlexRequests may not always provide accurate or reliable results. Textkernel gives no guarantees or warranties of any kind with the use of FlexRequests, nor does it accept any liability for their use. It is your sole responsibility to use FlexRequests in a responsible and ethical manner, not to use them for harmful, malicious or unethical purposes, and to ensure that your use of FlexRequests aligns with applicable laws and regulations.

Request Body

DocumentAsBase64String string required

A Base64 encoded string of the resume file bytes. This should use the standard 'base64' encoding as defined in RFC 4648 Section 4 (not the 'base64url' variant). .NET users can use the Convert.ToBase64String(byte[]) method.
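For non-.NET users, a minimal Python sketch of the same encoding step is shown below. The helper name `encode_document` is hypothetical; `base64.b64encode` is the standard-library call that uses the standard alphabet, while `urlsafe_b64encode` produces the 'base64url' variant that should not be used here.

```python
import base64


def encode_document(file_bytes: bytes) -> str:
    """Encode raw file bytes with the standard Base64 alphabet (RFC 4648 Section 4)."""
    # b64encode uses '+' and '/'; urlsafe_b64encode would use '-' and '_' instead.
    return base64.b64encode(file_bytes).decode("ascii")


# Three bytes chosen so that the two alphabets visibly differ.
sample = b"\xfb\xff\xfe"
standard = encode_document(sample)
url_safe = base64.urlsafe_b64encode(sample).decode("ascii")
print(standard, url_safe)
```

In practice the input would be the untouched bytes of the original file, for example read with `pathlib.Path(...).read_bytes()`.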

RevisionDate string required

Mandatory date, in YYYY-MM-DD format, so that the Parser knows how to interpret dates in the document that are expressed as "current" or "as of" or similar. To find out why this is so important and how to calculate/find it, read here.
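As a sketch of the formatting side only, the snippet below turns a POSIX timestamp into the required YYYY-MM-DD string. Using a file's last-modified time as the source of that timestamp is an assumption made for illustration; the linked guidance remains the authority on how the date should be chosen. The helper name `format_revision_date` is hypothetical.

```python
from datetime import datetime, timezone


def format_revision_date(timestamp: float) -> str:
    """Format a POSIX timestamp as a YYYY-MM-DD string (interpreted in UTC)."""
    return datetime.fromtimestamp(timestamp, tz=timezone.utc).strftime("%Y-%m-%d")


# Assumption: the timestamp could come from os.path.getmtime(path) for the original file.
print(format_revision_date(0))
```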

OutputHtml bool

When true, the original file is converted to HTML and stored in the Html property.

OutputRtf bool

When true, the original file is converted to RTF and stored in the Rtf property.

OutputPdf bool

When true, the original file is converted to PDF and stored in the Pdf property as a byte array.

UseLLMParser bool

When true, the LLM parser will be used. See here for more information.

OutputCandidateImage bool

When true, if the document contains inline images, the image that is most likely to be a photo of the candidate is returned as a byte array.

Configuration string Deprecated

This feature is not recommended and only available as an add-on. Please reach out to sales@textkernel.com.

Optional parser configuration string to be used for parsing. If not specified, the default parser configuration will be used.

SkillsData string[]

This feature is not recommended and only available as an add-on. Please reach out to sales@textkernel.com.

An array of your custom skills list names and the Textkernel "builtin" skills list. If no list is provided, the Textkernel builtin skills list will be used. The parser automatically detects the language and looks for a corresponding skills list in that language; if no match is found, this list is ignored.

NormalizerData string

This feature is not recommended and only available as an add-on. Please reach out to sales@textkernel.com.

Name of your custom normalization data file. If no list is provided, the Textkernel builtin skills list will be used (English only). When using custom normalization files, the language to be used is determined by the Parser (the default fallback language is English if the Parser cannot find a match).

GeocodeOptions object

Get or insert geocode coordinate values (latitude/longitude) during the parse transaction.


GeocodeOptions properties

IncludeGeocoding bool

When set to true, we will automatically geocode the parsed address through an API call to our /geocode endpoint, and you will be charged accordingly. This parameter defaults to false.

Provider string

The Provider you wish to use to geocode the postal address (current options are "Google", "Bing", or "None"). If not specified, we will default to Google. If you are just trying to update the postal address in the document, please set this to "None". If passing "Google" or "Bing", ProviderKey is required.

ProviderKey string

The Provider Key for the specified Provider. If using Bing you must specify your own provider key.
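The Provider and ProviderKey rules above can be checked on the client before a request is sent. The sketch below follows the rule as worded in the Provider description (a key is required when "Google" or "Bing" is passed explicitly); the helper name `provider_settings` and the placeholder key are hypothetical.

```python
from typing import Optional

KEYED_PROVIDERS = ("Google", "Bing")
ALL_PROVIDERS = KEYED_PROVIDERS + ("None",)


def provider_settings(provider: str, provider_key: Optional[str] = None) -> dict:
    """Return the Provider/ProviderKey part of GeocodeOptions, or raise if inconsistent."""
    if provider not in ALL_PROVIDERS:
        raise ValueError(f"Unknown Provider: {provider!r}")
    if provider in KEYED_PROVIDERS and not provider_key:
        raise ValueError(f"ProviderKey is required when Provider is {provider!r}")
    settings = {"Provider": provider}
    if provider_key:
        settings["ProviderKey"] = provider_key
    return settings


# "my-bing-key" is a placeholder, not a real key.
print(provider_settings("Bing", "my-bing-key"))
print(provider_settings("None"))
```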

PostalAddress object

The postal address you wish to geocode. For best results, specify as many of the PostalAddress fields as possible. If provided, this address will be used to get the geocode coordinates instead of the address included in the ParsedDocument (if present); however, the address in the ParsedDocument will not be modified.


PostalAddress properties

CountryCode string

The ISO 3166-1 alpha-2 code indicating the country for the postal address.

PostalCode string

The postal code (or zip code) for the postal address.

Region string

The region (e.g. State for U.S. addresses) for the postal address.

Municipality string

The municipality (e.g. City for U.S. addresses) for the postal address.

AddressLine string

The address line (e.g. Street address for U.S. addresses) for the postal address.

GeoCoordinates object

The geographic coordinates (latitude/longitude) for your postal address. Use this if you already have latitude/longitude coordinates and simply wish to add them to your parsed document. If provided, these values will be inserted into your ParsedDocument, and the address included in the ParsedDocument (if present) will not be modified.


GeoCoordinates properties

Latitude float

The latitude coordinate value.

Longitude float

The longitude coordinate value.

IndexingOptions object

When your account is enabled for Matching/Searching you can automatically index documents during the parse transactions.

Skills Normalization must be included to index documents using V2 Skills Taxonomy. These algorithms ignore raw skills and only consider the normalized skill concepts for skills category scoring. This leads to improved scoring and ranking because normalization produces fewer false negatives than simple exact keyword matching.


IndexingOptions properties

IndexId string

When your account is enabled for Matching/Searching you can automatically index documents during the parse transactions. This determines what index to place the parsed document in. This is case-insensitive.

DocumentId string

When your account is enabled for Matching/Searching you can automatically index documents during the parse transactions. This determines what id to give to the parsed document. This is restricted to alphanumeric with dashes and underscores. All values will be converted to lower-case.
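The DocumentId constraints above can be checked before sending a request. In this sketch the helper `normalize_document_id` is hypothetical, "alphanumeric" is assumed to mean ASCII letters and digits, and the lower-casing simply mirrors on the client what the description says will happen to the value.

```python
import re

# Assumption: "alphanumeric" means ASCII letters and digits only.
_DOCUMENT_ID = re.compile(r"[A-Za-z0-9_-]+")


def normalize_document_id(raw: str) -> str:
    """Validate a candidate DocumentId and return its lower-case form."""
    if not _DOCUMENT_ID.fullmatch(raw):
        raise ValueError(f"Invalid DocumentId: {raw!r}")
    return raw.lower()


print(normalize_document_id("Resume_2024-A"))
```

Because values are lower-cased, ids that differ only by case should be treated as the same id when choosing them.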

CustomIds string[]

The custom ids you want the document to have.

SkillsSettings object

Enable skills normalization and enhanced candidate summarization, and specify the version of the skills taxonomy for this parsing transaction.


SkillsSettings properties

Normalize bool

When true:

  • Raw skills will be normalized. These will be output under Value.ResumeData.Skills.Normalized. Read more about the benefits of using a skills taxonomy.
  • An enhanced candidate summary is generated, leveraging the taxonomy structure to relate skills to profession groups.

When using TaxonomyVersion V2 (see below), additional charges apply when normalization is enabled.

When you have access to TaxonomyVersion V1, and did not set the taxonomy to V2 explicitly (see below), normalization is enabled by default and the candidate summary is generated using the V1 taxonomy structure.

TaxonomyVersion string

Specifies the version of the skills taxonomy to use. Defaults to V2, unless your account has access to V1. If you have access to V1, use v2 as the value for this property to explicitly set V2.

V1 is deprecated and will be removed in a future release.

Benefits of V2 include:

  • 2x larger skills taxonomy, updated frequently based on real-world data
  • 15-40% higher accuracy of extracted skills
  • Better clustering of skill synonyms
  • Distinguish skill types (IT / Professional / Soft)
  • Improved candidate summary
  • Compatibility with the taxonomy used in Textkernel's Skills Intelligence APIs and Jobfeed, enabling standardization of taxonomies across all of your data and benchmarking against jobs posted online.

ProfessionsSettings object

Enable normalization of job titles using our proprietary taxonomy and international standards.


ProfessionsSettings properties

Normalize bool

When true, the three most recent job titles will be normalized. This includes a proprietary value from our profession taxonomy, plus ONET and ISCO mappings. Read more about the benefits of using a professions taxonomy.

When enabling professions normalization, additional charges apply.

The following languages are supported: Chinese (simplified and traditional), Croatian, Czech, Danish, Dutch, English, Finnish, French, German, Greek, Hebrew, Hungarian, Italian, Japanese, Norwegian, Polish, Portuguese, Romanian, Russian, Slovak, Slovenian, Spanish, Swedish, Turkish. For documents in other languages, no normalized values will be returned.

For Textkernel Search & Match, normalized professions are automatically indexed and used when profession normalization is enabled during parsing (through IndexingOptions). To leverage profession normalization for user-created searches, enable profession normalization at query time.

The professions taxonomy and the mappings are compatible with the taxonomies used in Textkernel's Skills Intelligence APIs and Jobfeed, enabling standardization of taxonomies across all of your data and benchmarking against jobs posted online.

FlexRequests object[]

Custom requests to ask during parsing. See the FlexRequests documentation for more details.


FlexRequests properties

Prompt string required

The prompt to be sent to the LLM Parsing Engine.

Identifier string required

Unique field name to be returned alongside the reply in the response.

DataType string required

The data type for the reply. One of: text, numeric, bool, list, enumeration.

EnumerationValues string[]

If DataType is enumeration, this is the list of possible replies. This is limited to a maximum of 50 values.
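The FlexRequests constraints listed above (three required fields, a fixed set of DataType values, and at most 50 EnumerationValues) can be enforced on the client. The sketch below is illustrative only: `build_flex_request` is a hypothetical helper, and the prompt and identifier shown are made-up examples.

```python
from typing import List, Optional

ALLOWED_DATA_TYPES = {"text", "numeric", "bool", "list", "enumeration"}
MAX_ENUMERATION_VALUES = 50


def build_flex_request(
    prompt: str,
    identifier: str,
    data_type: str,
    enumeration_values: Optional[List[str]] = None,
) -> dict:
    """Build one FlexRequests entry, rejecting values the documentation rules out."""
    if not prompt or not identifier:
        raise ValueError("Prompt and Identifier are required")
    if data_type not in ALLOWED_DATA_TYPES:
        raise ValueError(f"Unsupported DataType: {data_type!r}")
    request = {"Prompt": prompt, "Identifier": identifier, "DataType": data_type}
    if enumeration_values is not None:
        if len(enumeration_values) > MAX_ENUMERATION_VALUES:
            raise ValueError("EnumerationValues is limited to 50 values")
        request["EnumerationValues"] = list(enumeration_values)
    return request


# Hypothetical example request.
example = build_flex_request(
    "Is the candidate willing to relocate?",
    "WillingToRelocate",
    "enumeration",
    ["yes", "no", "unknown"],
)
print(example)
```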

Sample JSON

{
  "DocumentAsBase64String": "",
  "RevisionDate": "",
  "OutputHtml": false,
  "OutputRtf": false,
  "OutputPdf": false,
  "UseLLMParser": false,
  "OutputCandidateImage": false,
  "Configuration": "",
  "SkillsData": [
    ""
  ],
  "NormalizerData": "",
  "GeocodeOptions": {
    "IncludeGeocoding": false,
    "Provider": "",
    "ProviderKey": "",
    "PostalAddress": {
      "CountryCode": "",
      "PostalCode": "",
      "Region": "",
      "Municipality": "",
      "AddressLine": ""
    },
    "GeoCoordinates": {
      "Latitude": 0,
      "Longitude": 0
    }
  },
  "IndexingOptions": {
    "IndexId": "",
    "DocumentId": "",
    "CustomIds": [
      ""
    ]
  },
  "SkillsSettings": {
    "Normalize": false,
    "TaxonomyVersion": ""
  },
  "ProfessionsSettings": {
    "Normalize": false
  },
  "FlexRequests": [
    {
      "Prompt": "",
      "Identifier": "",
      "DataType": "",
      "EnumerationValues": [
        ""
      ]
    }
  ]
}

Response Body

Info object

Explains the outcome of the transaction.


Info properties

Code string

| Code | Description |
|---|---|
| Success | Successful transaction |
| WarningsFoundDuringParsing | Parsing was successful. This is not an error code. This is an advanced level message about the document, not about the parsing. For more information, refer to the ResumeQuality section in the parsed document output and to the documentation here. |
| PossibleTruncationFromTimeout | The timeout occurred before the document was finished parsing which can result in truncation |
| Timeout | The transaction reached its timeout limit |
| ConversionException | There was an issue converting the document |
| MissingParameter | A required parameter wasn't provided |
| InvalidParameter | A parameter was incorrectly specified |
| AuthenticationError | An error occurred with the credentials provided |

Message string

This message further describes the code providing additional detail.
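A client will typically branch on Info.Code. The sketch below groups the codes listed above into three buckets; the grouping itself (in particular treating PossibleTruncationFromTimeout as a usable but partial result) is an assumption made for illustration, and `classify_info_code` is a hypothetical helper.

```python
# Codes the table above describes as successful parses.
SUCCESS_CODES = {"Success", "WarningsFoundDuringParsing"}
# Assumption: a possibly truncated result is still usable, but should be flagged.
PARTIAL_CODES = {"PossibleTruncationFromTimeout"}


def classify_info_code(code: str) -> str:
    """Return 'success', 'partial' or 'error' for an Info.Code value."""
    if code in SUCCESS_CODES:
        return "success"
    if code in PARTIAL_CODES:
        return "partial"
    # Timeout, ConversionException, MissingParameter, InvalidParameter,
    # AuthenticationError, and anything unrecognized.
    return "error"


for code in ("Success", "PossibleTruncationFromTimeout", "ConversionException"):
    print(code, "->", classify_info_code(code))
```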

Value object

Contains response data for the transaction.


Value properties

ParsedDocument string

The parser results in JSON string format.
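Because ParsedDocument is described as a JSON string rather than a nested object, it needs a second decoding step after the response itself has been parsed. A minimal sketch, in which the helper name and the inner `ExampleField` key are hypothetical placeholders rather than real parser output:

```python
import json


def load_parsed_document(response: dict) -> dict:
    """Decode the JSON string held in Value.ParsedDocument into a dictionary."""
    return json.loads(response["Value"]["ParsedDocument"])


# Stand-in response; the inner content is a placeholder, not real parser output.
sample_response = {"Value": {"ParsedDocument": json.dumps({"ExampleField": "example"})}}
parsed = load_parsed_document(sample_response)
print(parsed["ExampleField"])
```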

ScrubbedParsedDocument string

This property is the Value.ParsedDocument with all of the Personally Identifiable Information (PII) fields such as first name, last name, email addresses, phone numbers, etc. scrubbed out.

FileType string

The input file type that was submitted through the ParseResumeRequest.

Text string

The plain text of the parsed resume.

TextCode string

A response code indicating the status of the conversion to text. See Document Conversion Result Codes for a complete list.

Html string

HTML version of the input file, if OutputHtml was set to true. Any HTML elements containing known PII will have the class tx-redacted. Below is some example CSS that can be used when displaying the HTML if you want to hide the elements containing known PII. Note that tech-savvy users would still be able to inspect the source HTML to see the redacted data. To fully prevent this, you would need to process the HTML and remove these elements prior to delivering the HTML to your users.

Sample CSS

.tx-redacted, .tx-redacted * {
    background-color: black !important; /* make the background of the redacted elements black */
    color: black !important; /* make the foreground of the redacted elements black */
    pointer-events: none !important; /* disable any links in the redacted elements */
    user-select: none !important; /* do not allow the user to select/copy text within the redacted elements */
}
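Removing the redacted elements entirely, as suggested above, could be sketched with Python's standard-library `html.parser`. This is one possible approach under stated assumptions, not Textkernel's recommended implementation: the `RedactedStripper` class and `strip_redacted` helper are hypothetical, the input is assumed to be reasonably well-nested HTML, and comments and doctype declarations are dropped from the output.

```python
from html.parser import HTMLParser

# Elements that never have a closing tag, so they must not affect nesting depth.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
             "link", "meta", "source", "track", "wbr"}


def _is_redacted(attrs) -> bool:
    return "tx-redacted" in (dict(attrs).get("class") or "").split()


class RedactedStripper(HTMLParser):
    def __init__(self) -> None:
        super().__init__(convert_charrefs=False)
        self.out = []
        self.skip_depth = 0  # greater than 0 while inside a redacted element

    def handle_starttag(self, tag, attrs):
        if self.skip_depth or _is_redacted(attrs):
            if tag not in VOID_TAGS:
                self.skip_depth += 1
            return
        self.out.append(self.get_starttag_text())

    def handle_startendtag(self, tag, attrs):
        if not self.skip_depth and not _is_redacted(attrs):
            self.out.append(self.get_starttag_text())

    def handle_endtag(self, tag):
        if self.skip_depth:
            if tag not in VOID_TAGS:
                self.skip_depth -= 1
            return
        self.out.append(f"</{tag}>")

    def handle_data(self, data):
        if not self.skip_depth:
            self.out.append(data)

    def handle_entityref(self, name):
        if not self.skip_depth:
            self.out.append(f"&{name};")

    def handle_charref(self, name):
        if not self.skip_depth:
            self.out.append(f"&#{name};")


def strip_redacted(html_text: str) -> str:
    """Return html_text without any element carrying the tx-redacted class."""
    parser = RedactedStripper()
    parser.feed(html_text)
    parser.close()
    return "".join(parser.out)


sample_html = '<p>Hello <span class="tx-redacted">Jane <b>Doe</b></span>, welcome<br>!</p>'
cleaned = strip_redacted(sample_html)
print(cleaned)
```

For production use, a full HTML parsing library that tolerates malformed markup may be a safer choice than this depth-counting sketch.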

HtmlCode string

A response code indicating the status of the conversion to HTML. See Document Conversion Result Codes for a complete list.

Rtf string

RTF version of the input file, if OutputRtf was set to true.

RtfCode string

A response code indicating the status of the conversion to RTF. See Document Conversion Result Codes for a complete list.

Pdf string

The base 64 encoded string of the PDF version of the input file, if OutputPdf was set to true.
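To get the PDF bytes back, the Pdf value needs to be Base64-decoded. A minimal sketch, assuming the response has already been loaded into a dictionary; the helper name `decode_pdf` and the stand-in payload are hypothetical.

```python
import base64
from typing import Optional


def decode_pdf(value: dict) -> Optional[bytes]:
    """Return the decoded PDF bytes from Value.Pdf, or None if no PDF was returned."""
    pdf_b64 = value.get("Pdf")
    if not pdf_b64:
        return None
    return base64.b64decode(pdf_b64)


# Stand-in Value object; the payload is just a PDF header, not a full document.
sample_value = {"Pdf": base64.b64encode(b"%PDF-1.4").decode("ascii")}
pdf_bytes = decode_pdf(sample_value)
print(pdf_bytes)
```

The resulting bytes can then be written to disk in binary mode.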

PdfCode string

A response code indicating the status of the conversion to PDF. See Document Conversion Result Codes for a complete list.

CandidateImage string

If a candidate photo was extracted, it will be output in this field as a base 64 encoded string of the byte array.

CandidateImageExtension string

If a candidate photo was extracted, the appropriate file extension for the photo will be output for this field (e.g. ".png").

FileExtension string

The recommended file extension of the input file.

CreditsRemaining decimal

The number of remaining credits is returned with every response. Please ensure that you set up monitoring of this value to ensure that you don't experience an outage by letting your credits reach 0.
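The monitoring suggested above can start as a simple threshold check on every response. In this sketch the helper name `credits_are_low` and the threshold of 1000 are hypothetical; an appropriate threshold depends on your own usage.

```python
def credits_are_low(response: dict, threshold: float) -> bool:
    """Return True when Value.CreditsRemaining is at or below the threshold."""
    return response["Value"]["CreditsRemaining"] <= threshold


# Hypothetical threshold; pick one that leaves time to top up before reaching 0.
LOW_CREDIT_THRESHOLD = 1000

sample_response = {"Value": {"CreditsRemaining": 250.5}}
if credits_are_low(sample_response, LOW_CREDIT_THRESHOLD):
    print("Credits are running low; alert the account owner.")
```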

GeocodeResponse object

If Request.GeocodeOptions.IncludeGeocoding is set to true (thus geocoding is executed), this object will be populated with a response.


GeocodeResponse properties

Code string

Maps to the Response.Code parameter of a Geocode Transaction.

Message string

Maps to the Response.Message parameter of a Geocode Transaction.

IndexingResponse object

If Request.IndexingOptions contains any specified parameters, this object will be populated with a response.


IndexingResponse properties

Code string

Maps to the Response.Code parameter of an Index a Document Transaction.

Message string

Maps to the Response.Message parameter of an Index a Document Transaction.

ProfessionNormalizationResponse object

If profession normalization was requested via ProfessionsSettings.Normalize, the status of the profession normalization transaction will be output here.


ProfessionNormalizationResponse properties

Code string

| Code | Description |
|---|---|
| Success | Successful transaction |
| Unhandled Exception | Unhandled Exception |

Message string

A short human-readable description explaining the Code value.

FlexResponse object

Information about the FlexRequests transaction, if any were provided.


FlexResponse properties

Code string

| Code | Description |
|---|---|
| Success | Successful transaction |
| MissingParameter | A required parameter wasn't provided |
| InvalidParameter | A parameter was incorrectly specified |

Message string

A short human-readable description explaining the Code value.

Responses object[]

The replies to the FlexRequests that were provided.


Responses properties

Identifier string

Unique field name assigned to the respective FlexRequest

Reply string

Reply to the FlexRequest

ReplyList string[]

List of replies to the FlexRequest if the FlexRequest had a List DataType
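The Responses entries can be collected into a dictionary keyed by Identifier. The sketch below assumes that ReplyList is only populated for requests with a list DataType and that Reply carries the answer otherwise; the helper name and the sample identifiers are hypothetical.

```python
def collect_flex_replies(flex_response: dict) -> dict:
    """Map each Identifier to its ReplyList (when populated) or its Reply."""
    replies = {}
    for item in flex_response.get("Responses") or []:
        reply_list = item.get("ReplyList")
        # Assumption: a non-empty ReplyList means the request had a list DataType.
        replies[item["Identifier"]] = reply_list if reply_list else item.get("Reply")
    return replies


# Stand-in FlexResponse with made-up identifiers and replies.
sample_flex_response = {
    "Code": "Success",
    "Message": "",
    "Responses": [
        {"Identifier": "WillingToRelocate", "Reply": "yes", "ReplyList": []},
        {"Identifier": "Hobbies", "Reply": None, "ReplyList": ["chess", "cycling"]},
    ],
}
flex_replies = collect_flex_replies(sample_flex_response)
print(flex_replies)
```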

Sample JSON

{
  "Info": {
    "Code": "",
    "Message": ""
  },
  "Value": {
    "ParsedDocument": "",
    "ScrubbedParsedDocument": "",
    "FileType": "",
    "Text": "",
    "TextCode": "",
    "Html": "",
    "HtmlCode": "",
    "Rtf": "",
    "RtfCode": "",
    "Pdf": "",
    "PdfCode": "",
    "CandidateImage": "",
    "CandidateImageExtension": "",
    "FileExtension": "",
    "CreditsRemaining": 0,
    "GeocodeResponse": {
      "Code": "",
      "Message": ""
    },
    "IndexingResponse": {
      "Code": "",
      "Message": ""
    },
    "ProfessionNormalizationResponse": {
      "Code": "Success",
      "Message": "string"
    },
    "FlexResponse": {
      "Code": "Success",
      "Message": "string",
      "Responses": [
        {
          "Identifier": "string",
          "Reply": "string",
          "ReplyList": [
            "string"
          ]
        }
      ]
    }
  }
}