GcPdf v4 - December 10, 2020

Extract Table Content from PDF Documents

Suppose you have more than 500 PDF invoices and need to extract their table data into a different format for analysis. Where do you begin? Office suites such as Microsoft Office, LibreOffice, and OpenOffice can export to PDF, but they offer limited functionality for pulling data out of PDF documents in an organized fashion.

The first thought is that this is easy: just copy and paste from each document. But what if you have thousands of tables to copy data from? In cases like this, an automated tool is easier, more efficient, and more cost-effective. GrapeCity Documents for PDF (GcPdf) adds support for extracting table data from PDF documents through the GetTable(..) method. The method accepts several parameters that ease the process of gathering data from these files.

With this tool, you can create applications that automate scanning PDF documents with different data but similar layouts and extracting their table data.

See the blog post that extracts invoice table data from a PDF using GcPdf and copies it to a CSV file. You can then use the GcExcel .NET Excel API to import the CSV and modify the table formatting.
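Once the table rows are out of the PDF, writing them to CSV is mostly a quoting exercise. The sketch below assumes the extracted table has already been flattened into an array of rows, each an array of cell values. That shape and the `escapeCsvCell` and `toCsv` helpers are illustrative assumptions, not part of the GcPdf API (GetTable itself is a .NET method). It is written in JavaScript only to match the viewer samples later in this post.

```javascript
// Hypothetical helpers: serialize already-extracted table rows to CSV text.
// `rows` is assumed to be an array of arrays of cell values.
function escapeCsvCell(value) {
  var text = value === null || value === undefined ? "" : String(value);
  // Quote cells containing a comma, quote, or line break; double embedded quotes.
  if (/[",\r\n]/.test(text)) {
    return '"' + text.replace(/"/g, '""') + '"';
  }
  return text;
}

function toCsv(rows) {
  return rows
    .map(function (row) { return row.map(escapeCsvCell).join(","); })
    .join("\r\n");
}

// Example with made-up invoice rows.
var csv = toCsv([
  ["Item", "Qty", "Price"],
  ["Widget, large", 2, "19.99"],
  ['12" bracket', 1, "4.50"]
]);
console.log(csv);
```

Cells with commas or quotes are wrapped in double quotes, which is what most spreadsheet importers expect.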

Examples of extracting table data from PDF files with GcExcel .NET Excel Library by GrapeCity

View Help | Demo

GrapeCity Documents PDF Viewer (GcPdfViewer)

Collaboration Mode

Now that many paper documents have been converted to digital formats, collaboration and sharing take on new meaning. Documents need to be shared and, at times, edited online by multiple users, and most organizations use PDF as their preferred sharing format. GcPdfViewer provides this functionality online, letting people in different geographic locations edit, comment on, change, and add data to a document as necessary. This makes it easier for teams to collaborate on documents, improving productivity and accuracy while saving time and money.

GcPdfViewer has the following features ready to go for sharing documents with a team:

(Note that this feature requires configuring the SupportApi project that runs GcPdf on the server to save the changes and send them back to the client. It requires a GcPdf license and a GcPdfViewer Professional license.)

1. Share documents from the viewer through a simple button on the toolbar.

PDF Editing collaboration example with GcPdfViewer by GrapeCity

2. Select the user with whom you want to share the document.

Example of setting up collaborators to work on editing a PDF with GcPdfViewer by GrapeCity

3. Set the permissions on the document like 'view' or 'view and change'.

Set security for collaboration tools while editing PDF files with GcPdfViewer by GrapeCity

4. Simultaneously view which user is modifying the document.

See edits and comments to your PDF file in real-time with GcPdfViewer by GrapeCity

5. Change a user's permissions (available to the person who assigned them).

Change permissions of users on the fly with GcPdfViewer by GrapeCity

6. Edit the document, add annotations, form fields, comments, and share it with other users.

Add a variety of comments, annotations, form fields, and other content and share with other users with GcPdfViewer by GrapeCity

7. Allow or deny users unknown to the server.

To enable document sharing in GcPdfViewer, developers set the 'sharing' option in JavaScript code. Use the 'disallowUnknownUsers' property to disallow user names unknown to the server, or set 'knownUserNames' to allow specific users.

// Disallow user names unknown to the server.
var viewer = new GcPdfViewer("#root", {
    userName: "John",
    sharing: { disallowUnknownUsers: true },
    supportApi: "api/pdf-viewer"
});

// Alternatively, list the specific users to allow.
var viewer = new GcPdfViewer("#root", {
    userName: "John",
    sharing: {
        knownUserNames: ["Jaime Smith", "Jane Smith"],
        disallowUnknownUsers: true
    },
    supportApi: "api/pdf-viewer"
});
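The two configurations above differ only in whether a list of allowed names is supplied. As a small sketch, a helper can assemble the options object from those inputs. `buildSharingOptions` is a hypothetical name. The property names (`userName`, `sharing`, `knownUserNames`, `disallowUnknownUsers`, `supportApi`) are the ones used in the snippets above.

```javascript
// Hypothetical helper that assembles viewer options for collaboration mode.
function buildSharingOptions(userName, knownUserNames) {
  var sharing = { disallowUnknownUsers: true };
  // Only add an explicit list when the caller supplies one.
  if (Array.isArray(knownUserNames) && knownUserNames.length > 0) {
    sharing.knownUserNames = knownUserNames.slice();
  }
  return {
    userName: userName,
    sharing: sharing,
    supportApi: "api/pdf-viewer" // same endpoint as in the samples above
  };
}

var sharingOptions = buildSharingOptions("John", ["Jaime Smith", "Jane Smith"]);
console.log(JSON.stringify(sharingOptions, null, 2));
// In the browser: var viewer = new GcPdfViewer("#root", sharingOptions);
```

Keeping the options in one place like this may help when the same sharing policy is applied to several viewer instances.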
            

Find more details on how to turn on the sharing mode and set sharing options in the Viewer:

View Help | Demo

New Custom HTML5 Input Types in PDF Forms

Online forms have become increasingly complex and collect more data than ever before. Often these forms ask for user information such as date of birth, telephone number, URL, or email address. However, such input types are not part of the standard PDF specification. The following custom input types are now supported:

Example of supported input types for GcExcel .NET Excel Library and GcPdfViewer, supporting custom HTML5 inputs by GrapeCity

  • Text
  • Date
  • Time
  • Telephone number
  • Email ID
  • URL
  • Password
  • Month
  • Week
  • Number
  • Range

Together with these new input types, you can specify the following settings -

  • Autocomplete
  • Autofocus
  • Required
  • Spell check
  • Min/max length
  • String/number patterns
  • Default Values
  • And more
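To make the settings concrete, here is a rough sketch of what checks such as Required, Min/max length, and patterns amount to for a single text value. This is not GcPdfViewer's validation code. The `checkValue` function and the setting keys `required`, `minlength`, `maxlength`, and `pattern` are assumptions that simply mirror the HTML5 attribute names.

```javascript
// Illustrative only: how HTML5-style constraints apply to one text value.
// Setting keys mirror HTML5 attribute names and are assumptions here.
function checkValue(value, settings) {
  var text = value === null || value === undefined ? "" : String(value);
  var problems = [];
  if (text === "") {
    // Like HTML5 inputs, an empty value only fails the "required" check.
    if (settings.required) problems.push("required");
    return problems;
  }
  if (settings.minlength != null && text.length < settings.minlength) {
    problems.push("minlength");
  }
  if (settings.maxlength != null && text.length > settings.maxlength) {
    problems.push("maxlength");
  }
  if (settings.pattern != null) {
    // HTML5 patterns must match the whole value, so anchor the expression.
    var re = new RegExp("^(?:" + settings.pattern + ")$");
    if (!re.test(text)) problems.push("pattern");
  }
  return problems;
}

// Made-up rules for a postal code field.
var zipRules = { required: true, minlength: 5, maxlength: 10, pattern: "\\d{5}(-\\d{4})?" };
console.log(checkValue("", zipRules));
console.log(checkValue("1234", zipRules));
console.log(checkValue("12345-6789", zipRules));
```

An empty list means the value passes every configured check.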

Following is an example of a Lease Agreement PDF form created using GcExcel API and viewed/filled in GcPdfViewer -

Example lease form using GcExcel .NET Excel Library and GcPdfViewer to add, fill and submit custom HTML5 controls by GrapeCity

Adding Custom Input Types in PDF Documents

GcPdf allows custom properties for text fields using the GcProps dictionary. GcPdfViewer uses these properties to provide additional UI and validation for the text fields. The properties and validation settings are the same as those supported in GcExcel (see the topic below).

Here is how you add a Date Field to a PDF form -

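// Namespaces used by this snippet.
using System.Drawing;
using GrapeCity.Documents.Pdf;
using GrapeCity.Documents.Pdf.AcroForms;
using GrapeCity.Documents.Text;

var doc = new GcPdfDocument();
var page = doc.NewPage();
var g = page.Graphics;

// Text format shared by the field's widget.
TextFormat tf = new TextFormat();
tf.Font = StandardFonts.Times;
tf.FontSize = 14;

// Create the text field and position its widget on the page.
var field = new TextField();
field.Widget.Page = page;
field.Widget.Rect = new RectangleF(40, 40, 72 * 3, 24);
field.Widget.TextFormat.Font = tf.Font;
field.Widget.TextFormat.FontSize = tf.FontSize;
field.Name = "Start Date";

// Custom properties that GcPdfViewer reads to render a required date input.
field.GcProps["title"] = "Start Date";
field.GcProps["type"] = "date";
field.GcProps["required"] = true;
field.GcProps["validationmessage"] = "The date cannot be empty.";
doc.AcroForm.Fields.Add(field);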
            

This is how it appears in the PDF form when viewed in PDF Viewer.

Example of PDF form with date/time field added by GcExcel .NET Excel Library and GcPdfViewer combined by GrapeCity

Read more about how to add custom input type fields in PDFs with the GcPdf API. View the full list of supported input types and settings.

NOTE: these input types are not supported in the standard PDF specification and can only be viewed/filled in GcPdfViewer.

How to Add Custom Input Types with GcExcel Templates

If you have a GcExcel license, you can also add new custom input types through Excel templates and generate PDF forms. The custom input types supported are the same as those in GcPdf.

Example of creating a custom input type for PDF form using GcExcel .NET Excel API and GcPdf by GrapeCity

Once a PDF form is generated, you can view it in our JavaScript-based GrapeCity Documents PDF Viewer (GcPdfViewer). Learn how to add custom input type fields in PDFs with GcExcel .NET Excel API and visit the demo.

View Full List of Supported Custom Input Types and Properties

View Help | Demo

PDF Form Filler

It is often necessary to add extra information to an existing form, such as fields or validation messages, when the underlying form cannot be modified. How can this be done while keeping the underlying form, adding validation to ensure data integrity, and making sure the form can be filled out quickly on devices large and small? With the PDF Form Filler dialog, you can customize field labels, fine-tune the behavior of input controls, and add input validation, even if the PDF has no inline validation or field label information. You can also fill in field values to populate data in the underlying PDF.

Note: Saving the filled PDF form on the client requires configuring the SupportApi project that runs GcPdf on a server to save the changes and send them back to the client. This requires a GcPdf license and a GcPdfViewer Professional license.

Example of tenant application using Form Filler, GcPdf, and GcPdfViewer by GrapeCity

The new input types can be added via GcPdf or GcExcel .NET Excel API to a new PDF form or a third-party PDF form. The field properties and validation settings are customized through the Form Filler dialog.

The code below demonstrates how to turn on the option for Form Filler and access Form Fields under the 'mappings' section –

function loadPdfViewer(selector) {
    var options = {};
    options = setupFormFiller(options);
    var viewer = new GcPdfViewer(selector, options);
    viewer.addDefaultPanels();
    viewer.toolbarLayout.viewer = {
        default: ['open', 'form-filler', '$navigation', '$split', 'text-selection', 'pan', '$zoom', '$fullscreen', 'print', 'title', 'about'],
        mobile: ['open', 'form-filler', '$navigation', 'title', 'about'],
        fullscreen: ['$fullscreen', 'open', 'form-filler', '$navigation', '$split', 'text-selection', 'pan', '$zoom', 'print', 'title', 'about']
    };
    viewer.open("hotelbooking.pdf");
}

function setupFormFiller(baseOptions) {
    var options = baseOptions || {};
    // Form Filler options:
    options.formFiller = {
        mappings: {
            'AppDate': {
                title: 'Application date',
                displayname: 'Date',
                type: 'date',
                defaultvalue: new Date().toJSON().slice(0, 10)
            },
        }
    };
    return options;
}
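One detail in the `defaultvalue` line above: `toJSON()` formats the date in UTC, so near midnight the sliced string can differ from the user's local calendar date. The sketch below contrasts the two. `utcIsoDate` and `localIsoDate` are hypothetical helper names, not viewer APIs.

```javascript
// YYYY-MM-DD in UTC: the same expression the mapping above uses.
function utcIsoDate(date) {
  return date.toJSON().slice(0, 10);
}

// YYYY-MM-DD in the local time zone of the machine running the script.
function localIsoDate(date) {
  function pad(n) { return (n < 10 ? "0" : "") + n; }
  return date.getFullYear() + "-" + pad(date.getMonth() + 1) + "-" + pad(date.getDate());
}

var now = new Date();
console.log("UTC date:  ", utcIsoDate(now));
console.log("Local date:", localIsoDate(now));
```

Both helpers return the same string format, so which one suits a default value depends on whether the form should reflect the UTC day or the user's local day.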
            

Key Features:

Customize UI Appearance of Form Fields

Example of customized UI for a responsive form using GcPdfViewer by GrapeCity

Modify Validation Messages for Form Fields

Example of using form validation with GcPdfViewer by GrapeCity

Form Event Handlers and Settings

You can use:

  • onInitialize - called after the list of fields is loaded and initialized but not yet rendered.
  • beforeApplyChanges - called when the Apply button is clicked, after successful field validation.
  • beforeFieldChange - called right before a field value changes.
  • mappings {} - control the appearance, behavior, and validation settings for the input field inside the Form Filler dialog with Form Fields mappings, key - field name, value
  • validator? - the common "validator" function is called for each field before saving changes or on user input when field mapping settings contain a validateOnInput flag.
  • And more

Define Custom Content

If you want to add formatted information to your PDF form, either as simple HTML content or an HTML table, you can add a field with 'type' as 'custom-content' and add HTML code for the 'content' property.

'CustomContent_Info1': {
    type: 'custom-content',
    content: `<table>
        <tr><td style='vertical-align:top;'>
            <i><u>Corporation:</u></i>
        </td><td style='vertical-align:top;'>
            <i>Articles of Incorporation must be provided and 2 years' Annual Report and corporate tax return.</i>
        </td></tr>
        <tr><td style='vertical-align:top;'>
            <i><u>Partnership:</u></i>
        </td><td style='vertical-align:top;'>
            <i>Partnership Agreement must be provided plus individual partners' current personal financial statement and 2 years' personal tax returns.</i>
        </td></tr>
        <tr><td style='vertical-align:top;'>
            <i><u>Individual:</u></i>
        </td><td style='vertical-align:top;'>
            <i>Personal balance sheet and 2 years' personal tax returns must be provided. Must include Drivers' license number.</i>
        </td></tr>
    </table>`
},
            
            

This is how the custom content appears in the Form Filler dialog.

Custom content created by GcPdf and Form Filler by GrapeCity
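If the rows of such a table come from data rather than a hand-written string, the HTML can be generated. The following is a sketch only: `escapeHtml` and `customContentTable` are hypothetical helpers, and the returned object uses just the 'type' and 'content' properties shown above.

```javascript
// Escape text so it is safe to place inside HTML markup.
function escapeHtml(text) {
  return String(text)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&#39;");
}

// Build a 'custom-content' mapping entry from [label, description] pairs.
function customContentTable(rows) {
  var cell = "<td style='vertical-align:top;'>";
  var body = rows.map(function (row) {
    return "<tr>" + cell + "<i><u>" + escapeHtml(row[0]) + ":</u></i></td>" +
           cell + "<i>" + escapeHtml(row[1]) + "</i></td></tr>";
  }).join("");
  return { type: "custom-content", content: "<table>" + body + "</table>" };
}

// Made-up rows for illustration.
var entry = customContentTable([
  ["Corporation", "Articles of Incorporation must be provided."],
  ["Individual", "Personal balance sheet & tax returns must be provided."]
]);
console.log(entry.content);
```

The resulting object would be assigned to a key under the Form Filler 'mappings' section, in the same position as 'CustomContent_Info1' above.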

Fill Forms on Small Devices

Small screens become more problematic as forms grow or collect more information. The PDF Form Filler uses various formats and a responsive design to gather the form's information without displaying the full PDF. The data is collected in a list-like format, then applied to the underlying PDF file and saved.

Example of form data collection on a small device using responsive design and GcPdfViewer by GrapeCity

Read more about PDF Form Filler support:

Help | Demo

Convert Annotation and Form Fields to Content Elements

Annotations and form fields used in PDF documents and forms can now be converted to content elements when the PDF document is saved. On save, all annotations and fields marked for conversion become page content, and the original elements are removed from the PDF document.

Use the "Convert" button to apply the conversion to content.

Example of conversion of annotations and content into PDF file using GcPdfViewer by GrapeCity

Once you have marked an element for conversion, you cannot edit it.

Use the "Revert" button to undo the content conversion.

Example of reverting conversions to the original content using GcPdfViewer by GrapeCity

View Help | Demo

Change Default Values of Annotations and Form Fields

The default values of annotations and form fields added from the toolbar can be changed in code. The following code changes the border width and interior color of the square annotation.

var viewer = new GcPdfViewer("#root", {
    editorDefaults: {
        squareAnnotation: {
            borderStyle: { width: 5, style: 1 },
            color: '#000000',
            interiorColor: '#ff0000'
        }
    },
    supportApi: "api/pdf-viewer"
});
            

View Demo

Introducing GcPdfViewer Professional Version

With the Professional Version of GcPdfViewer (additional licensing required, read more details here), v4 now supports editing PDF documents. To enable the editing features, the viewer must be connected, via the SupportApi property, to GrapeCity Documents for PDF (GcPdf) running on the server. Check out more details on configuring the viewer for editing tools.