Learn how to use Transformers
How to transform the data extracted by the extractors.
Transformers post-process the preliminary extraction results returned by the extractors. They let you apply basic transformations to the extracted values.
When several transformers are configured, they are applied in sequence. Transformers run before the verifiers.
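The sequential processing described above can be pictured as a simple chain in which each transformer receives the output of the previous one. The following is only a minimal sketch: `strip_string`, `only_characters`, and `apply_transformers` are hypothetical stand-ins written for illustration, not the product's own implementation.

```python
from typing import Callable, Iterable


def strip_string(value: str) -> str:
    # Hypothetical stand-in: remove leading and trailing whitespace.
    return value.strip()


def only_characters(value: str) -> str:
    # Hypothetical stand-in: keep every character except digits.
    return "".join(ch for ch in value if not ch.isdigit())


def apply_transformers(value: str, transformers: Iterable[Callable[[str], str]]) -> str:
    # Apply each transformer in order; the output of one feeds the next.
    for transform in transformers:
        value = transform(value)
    return value


first = apply_transformers("  Invoice 2024 ", [strip_string, only_characters])
second = apply_transformers("  Invoice 2024 ", [only_characters, strip_string])
print(repr(first), repr(second))
```

Because each step works on the previous result, the order of the transformers can change the outcome: in this sketch the first call leaves a trailing space behind, while the second does not.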
| Transformer | Description |
|---|---|
| Number: Money Amount | Removes common currency abbreviations and symbols, and handles thousands separators. |
| Number: Only Floats | Removes everything that is not a decimal value. |
| Number: Only Integers | Removes everything that is not an integer value. |
| Number: Only Numbers | Removes everything that is not a number. |
| Number: percent to decimal | Converts a percentage value to a decimal value, e.g. 72% -> 0.72. |
| String: Date | Parses the prediction string in the candidates and returns valid dates. |
| String: Datetime | Parses the candidate value for valid datetimes and returns them in the format yyyy-mm-ddThh:mm:ss. |
| String: Only Characters | Concatenates only the characters in the prediction and strips off the digits. |
| String: Search Regex | Searches the candidate with a regular expression to extract the value. |
| String: Strip String | Removes white spaces. |
| String: Substitute | Searches for a value using a regex and replaces it with a fixed value. |
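The percentage conversion listed for "Number: percent to decimal" (72% -> 0.72) can be illustrated with a short sketch. The function name `percent_to_decimal` and the use of Python's `Decimal` type are assumptions made for this example; how the product handles malformed input is not covered here.

```python
from decimal import Decimal


def percent_to_decimal(text: str) -> Decimal:
    # Hypothetical helper: drop the percent sign and surrounding whitespace,
    # then divide by 100. Decimal avoids binary floating-point rounding.
    number = text.strip().rstrip("%").strip()
    return Decimal(number) / Decimal(100)


print(percent_to_decimal("72%"))
```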
Transformer "String: Strip String"
Removes white spaces.
Transformer "String: Search Regex"
Searches the text of the candidate for the given regex pattern. If the pattern matches, the first matched substring is returned; if it does not match, the full text is retained.
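The behavior described here (first match if found, otherwise the unchanged text) can be sketched with Python's `re` module. The function name `search_regex` is hypothetical, and the product's regex dialect may differ from Python's.

```python
import re


def search_regex(text: str, pattern: str) -> str:
    # re.search returns the first match anywhere in the text, or None.
    match = re.search(pattern, text)
    # Fall back to the full text when the pattern is not found.
    return match.group(0) if match else text


print(search_regex("Invoice INV-1234, replaces INV-9999", r"INV-\d+"))
print(search_regex("no identifier here", r"INV-\d+"))
```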
Transformer "String: Substitute"
Searches for a value using a regex and replaces it with a fixed value.
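A regex substitution of this kind can be sketched as follows. The name `substitute` is hypothetical, and replacing every match (rather than only the first) is an assumption, since the description does not say which applies.

```python
import re


def substitute(text: str, pattern: str, replacement: str) -> str:
    # Assumption: every match is replaced. The lambda makes the replacement
    # a literal fixed value, so backslashes in it are not interpreted.
    return re.sub(pattern, lambda _match: replacement, text)


print(substitute("Tel: 030 / 1234-567", r"[^0-9]+", ""))
```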
Transformer "String: Date"
Parses the prediction string in the candidates and returns valid dates. A date is always divided into three blocks (day, month, and year).
Transformer "String: Datetime"
Parses the candidate value for valid datetimes and returns them in the format yyyy-mm-ddThh:mm:ss.
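The output format yyyy-mm-ddThh:mm:ss corresponds to the `strftime` pattern `%Y-%m-%dT%H:%M:%S` in Python. The sketch below assumes the input format is known in advance; the function name `to_iso_datetime` and the `input_format` parameter are hypothetical, and the product may detect input formats differently.

```python
from datetime import datetime


def to_iso_datetime(value: str, input_format: str) -> str:
    # Parse with an explicitly supplied format (an assumption of this sketch),
    # then render in the documented yyyy-mm-ddThh:mm:ss layout.
    parsed = datetime.strptime(value, input_format)
    return parsed.strftime("%Y-%m-%dT%H:%M:%S")


print(to_iso_datetime("24.12.2021 18:30:05", "%d.%m.%Y %H:%M:%S"))
```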
Transformer "Number: Only Numbers"
Removes everything that is not a number. Whether a value counts as a number is decided per token.
Transformer "Number: Only Integers"
Removes everything that is not an integer value. Whether a value counts as an integer is decided per token.
Transformer "Number: Only Floats"
Removes everything that is not a float value. Whether a value counts as a float is decided per token.
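The per-token decision used by the three "Number: Only ..." transformers can be sketched as below. Several details are assumptions made for illustration: tokens are taken to be whitespace-separated, a float is taken to use a dot as the decimal separator, kept tokens are joined with a single space, and all function names are hypothetical.

```python
import re

# Assumed token shapes: optional sign, ASCII digits, dot as decimal separator.
INTEGER = re.compile(r"[+-]?[0-9]+")
FLOAT = re.compile(r"[+-]?[0-9]+\.[0-9]+")


def keep_tokens(text: str, *patterns: re.Pattern) -> str:
    # Split on whitespace and keep a token only if it fully matches a pattern.
    kept = [tok for tok in text.split() if any(p.fullmatch(tok) for p in patterns)]
    return " ".join(kept)


def only_integers(text: str) -> str:
    return keep_tokens(text, INTEGER)


def only_floats(text: str) -> str:
    return keep_tokens(text, FLOAT)


def only_numbers(text: str) -> str:
    return keep_tokens(text, INTEGER, FLOAT)


sample = "Total 3 items at 19.99 EUR"
print(only_integers(sample), "|", only_floats(sample), "|", only_numbers(sample))
```

Because the check is made on whole tokens, a token such as `19.99EUR` would be dropped entirely in this sketch instead of being reduced to `19.99`.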
Transformer "Number: Money Amount"
Removes common currency abbreviations and symbols, and handles thousands separators.
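A rough sketch of this kind of clean-up is shown below. The list of currency markers, the assumption that "," is the thousands separator and "." the decimal separator, and the name `money_amount` are all illustrative choices; the product's actual rules (for example for amounts written as 1.234,50) are not specified here.

```python
import re

# Illustrative, non-exhaustive set of currency abbreviations and symbols.
CURRENCY = re.compile(r"EUR|USD|GBP|CHF|[€$£]", re.IGNORECASE)


def money_amount(text: str) -> str:
    # Drop currency markers and surrounding whitespace.
    cleaned = CURRENCY.sub("", text).strip()
    # Assumption: "," is a thousands separator and can simply be removed.
    return cleaned.replace(",", "")


print(money_amount("$ 1,234.50"))
print(money_amount("1,000 EUR"))
```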