In this tutorial we build a real life scenario, which you can use practically as it is in your business environment.
During building this scenario, we learn how to
To create the necessare service, is really very simple.
We create a file called pdfconv.py
:
from flask import Flask, request |
What happens here:
request.data
bytes
object, which is not suitable for PyPDF,
so we convert it to a file-like objectYou can download the code from here.
In a productive business environment you need to deploy it either in a cloud provider,
or on one of your on premise hosts e.g. as a Docker container.
In this tutorial we simply run it from the command prompt
You know already from the previous tutorials, that in order to receive any
HTTP requests in SYGR, we need to define a Message Converter Plugin.
In this case it will be an asynchronous, file plugin.
The file API of SYGR requires a Multipart form data media type input
with only one item, where
"file"
It then has to be sent to the file API, using URL
http(s)://[host]:[port]/[SYGR server name]/services/Rest/file/async/[plugin name]/default
The Plugin API provides the file in
data.messageIn.getTransfer().get(0)
as a simply Object, so we need to convert it to byte[]:
if(data.messageIn.getTransfer() == null || data.messageIn.getTransfer().size() == 0) { |
SYGR provides the possibility to store (even large) binary files in the SYGR databases.
Here we call these Attachments.
In the previous tutorial we have learned the role of the Store Types.
Store Types are a logical grouping of SYGR and database servers.
If we save something into a Store Type, that will be available
String storetype = util.getStoreTypeOfPotType(Constants.ENTPROTOPO);
|
The we create an Attachment Object, add the file to it and save:
Attachment atta = new Attachment(); |
The save method returns a unique ID of this attachment, which we store in the Creator we send to
asynchronous processing. This way the Entity Init Plugin (most probably in another server
running on another host) will find it.
We also set the name of the Entity Plugin, this will be executed automatically
before the Entity is saved.
creator.setInitplugin(Constants.INITPLUGINPO);
|
The Entity Init Plugin runs after the prototype of the Entity arrived
from the Catcher Server to one of the Store Servers, before actually saving is
into the database.
Prerequisite is, that the field initPlugin
is filled with the name
of the required Plugin in the Creator
Object within the MessageConverter.
In our case the Entity Init is doing quite a lot of work, let's see step by step.
First of all we need to get back the PDF file which we have saved as Attachment in the Message Converter.
For this, we first need to read the GUID of the attachment from the Entity and then read the Attachment.
To be nice, we also delete the already unnecessary Attachment from the database, and convert the content
of it to our PDF file.
Remarks:
Attr attr = data.pot.getFlexi(); |
We have prepared a nice Web Service, which receives a PDF and extracts its text content.
Now it's time to use it.
For this we
String uri = Util.getPdfTxtUrl(util); |
We have seen this util.execCommand()
earlier, it's a Jolly Joker
for command execution.
In our case we call the HTTP request sender. The parameters here:
Parameter position | Purpose | Content |
---|---|---|
1 | Main Command | "HTTP" for HTTP request sending |
2 | Request format | "BINARY" for a direct binary file sending |
3 | Expected response format | "TEXT" for any text type response |
4 | URL | Request URL |
5 | HTTP headers | The headers we want to send |
6 | BODY content | Its format depend on the request type, in this case simply the PDF file in a byte[] |
The response String we get simply as the 1st Object in result
, but we need to convert it
to an actual String object.
Object retobj = result.get(0); |
The 3rd line in this code looks strange. Why do we need it?
The reason is, that the GPT API very, but very much hates New Line characters.
It actually dies at once it receives one (not the complete OpenAI company, but your request).
However later, the SYGR GPT interface also takes care of it, never hurts to be careful.
The next step is, that we create a nice, so called Few Shot Prompt for GPT. In this
Text: [the PDF text]\\n |
We request the answer in an easy machine readable text format.
We could use anything else, but by our experience JSON can be problematic,
because GPT itselfs sends the response as JSON, and then our JSON
will be a field value within that JSON, which confuses the parsers.
But feel free to experience.
With this prompt we just call GPT:
ArrayList<Object> alo = util.execCommand("AI", "CHATGPT", "SIMPLE", "aigpt", prompt); |
Please see, that we read the second returned object.
The first one (alo.get(0)
) is the complete JSON answer from GPT, which also can be useful sometimes. The 2nd is just the answer.
Not too much left. We just
ExtAlert mail = new ExtAlert(); |
In the SYGR configuration there is nothing new.
We need 2 Entity Models: one is used by the Message Converter, the other is the final Model set by the Entity Init.
(Do not forget to link both to a Store Type!)
And we also need two Plugin definitions, one for the Message Converter and one for the Entity Init.
(And here do not forget to link the Message Converter Plugin to a Catcher Server!)
We are done. It was not so easy as the previous tutorials, but the result shows it was worth.
We have prepared a not too simple PDF document for our Purchase Order:
We send it to SYGR from Postman:
We have received an e-mail:
We can open the document in SYGR:
You can download the different codes:
We have arrived to the end of this tutorial.
We have learned how to convert an incoming PDF file to an actual document using different techniques (own Web service, GPT).
See you in our next tutorial, where we go into more details of the SYGR Automation System.
If you have questions, please contact us:
contact@sygr.ch
contact@sles-automation.com
+41 79 470 67 84