Info GetFragment and GetTextMap

Posted by: simona.fratus on 10 October 2023, 3:21 am EST

    Posted 10 October 2023, 3:21 am EST

    Hi!

    I’m trying to extract some text from a PDF using GetFragment, but when I pass the standard X/Y coordinates of the PDF area the output seems wrong: I get some words, but not from the right area :frowning:

    If I know the rectangle I want to extract the words from, how do I pass it to the GetFragment function correctly? Is there a conversion I should do first?

    Thank you very much!

    Simona

  • Posted 10 October 2023, 11:10 pm EST

    Hello Simona,

    These methods take position values in points, so please make sure you provide the values in that unit. If you still face the issue, please share the PDF file that shows it, along with the points you are using.

    Please see the following link for how to get text from a point:
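A minimal sketch of the unit conversion mentioned above, in case the rectangle was measured in millimeters or pixels rather than points. It relies only on the fact that one PDF point is 1/72 inch; the class and method names (`PdfUnits`, `MillimetersToPoints`, `PixelsToPoints`) are hypothetical helpers, not part of the library.

```csharp
using System;
using System.Drawing;

// Hypothetical helper, not part of any PDF library.
public static class PdfUnits
{
    // One PDF point is 1/72 inch; one inch is 25.4 mm.
    public const float PointsPerInch = 72f;
    public const float MillimetersPerInch = 25.4f;

    // Converts a length in millimeters to points.
    public static float MillimetersToPoints(float mm) =>
        mm * PointsPerInch / MillimetersPerInch;

    // Converts a length in pixels to points; dpi is the resolution
    // at which the pixel coordinates were measured.
    public static float PixelsToPoints(float px, float dpi) =>
        px * PointsPerInch / dpi;

    // Converts every component of a millimeter rectangle to points.
    public static RectangleF MillimetersToPoints(RectangleF mm) =>
        new RectangleF(
            MillimetersToPoints(mm.X),
            MillimetersToPoints(mm.Y),
            MillimetersToPoints(mm.Width),
            MillimetersToPoints(mm.Height));
}
```

If the coordinates are already in points, no conversion of this kind is needed.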

    https://www.grapecity.com/documents-api-pdf/docs/online/parse-pdf-documents.html

    Regards,

    Prabhat Sharma.

  • Posted 11 October 2023, 2:48 am EST

    testPDFforGCpost.zip

    Let’s say I have the PDF attached here, and I know that the rectangle containing the “ELEMENTI RETRIBUTIVI” text is a RectangleF(25, 170, 20, 65). How do I convert that to points? Is it possible to extract that text?

    I know that this PDF has been written in a weird order, and maybe that could be a problem for the text extraction too, I don’t know.

    In the real case I need to extract text from other areas where the data is dynamic, but the process is the same :slight_smile:

    Thank you very much,

    Simona

  • Posted 11 October 2023, 11:16 pm EST

    Hello Simona,

    Thank you for the attached PDF.

    We can observe the issue too and are discussing it further with the developers.

    We will let you know as soon as we get an update from them.

    [Internal Tracking ID: DOC-5742]

    Regards,

    Prabhat Sharma.

  • Posted 11 October 2023, 11:37 pm EST

    Nice to know, thank you very very much! :slight_smile:

  • Posted 15 October 2023, 5:50 pm EST

    Hello Simona,

    According to the developers, the HitTest method is not intended for this scenario. Please use the ITextMap.GetFragmentFromRect(RectangleF) method instead (it is an extension method):

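    // Text map of the first page. tmap was not defined in the original snippet;
    // obtaining it via Page.GetTextMap() is an assumption.
    var tmap = doc1.Pages[0].GetTextMap();
    // Rectangle of interest, in points.
    var rect = new RectangleF(25, 170, 20, 65);
    // Text fragment covered by the rectangle.
    var tmf = tmap.GetFragmentFromRect(rect);
    Console.WriteLine(tmap.GetText(tmf));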
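One more thing that may be worth checking when a rectangle returns text from the wrong area: native PDF coordinates put the origin at the bottom-left of the page, while many drawing APIs measure Y from the top. The sketch below flips a rectangle between the two conventions. It is an illustration under that assumption, not something confirmed in this thread, and `PdfRects.BottomLeftToTopLeft` is a hypothetical helper name.

```csharp
using System;
using System.Drawing;

// Hypothetical helper, not part of any PDF library.
public static class PdfRects
{
    // r is given in points with the origin at the bottom-left of the page,
    // where r.Y is the Y of the rectangle's lower edge.
    // The result describes the same area with the origin at the top-left,
    // where Y is the distance from the top of the page to the upper edge.
    public static RectangleF BottomLeftToTopLeft(RectangleF r, float pageHeight) =>
        new RectangleF(r.X, pageHeight - r.Y - r.Height, r.Width, r.Height);
}
```

For example, on a page 842 points high, a bottom-left rectangle of (25, 170, 20, 65) maps to (25, 607, 20, 65) in top-left coordinates. X, width and height do not change.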

    If you need any other help, please feel free to ask.

    Regards,

    Prabhat Sharma.

    GetFragment Demo.zip

  • Posted 15 October 2023, 8:26 pm EST

    Thank you very very much!! :smiley:
