Frequently Asked Questions

Please use the question form to send your questions or comments.
See the video tutorials on YouTube that explain various features of TExtract.


1. How do I create an index with TExtract?

These are the basic steps when creating an index manuscript with TExtract:
  1. Drag-and-drop a PDF version of the publication onto TExtract, or use Project ► New.
  2. Complete the pages in the Create Initial Index wizard; press Finish to automatically create the initial index.
  3. Then press Edit to open the editor. You can navigate, inspect, select, edit and add entries and occurrences in various ways. To see the resulting formatted index press the Preview tab at any time. When you have finished editing, close the editor.
  4. Press Export, then press Ok. The formatted index is saved as an RTF file. This can be opened in any publishing software, such as Microsoft Word.
  5. Close TExtract. Review and edit the index as needed in your publishing software.
  6. Then include the index in the source document, or send it to the publisher.

See the video tutorial (6 minutes).

A short manual is available in TExtract under Help ► Manual. Print out the list of editing commands under Help ► Command chart. Hints on the screen will suggest available commands.


2. How does TExtract differ from other indexing software?

TExtract uses a combination of automated and manual indexing, enhanced by linking of the index with the text. No other back-of-book indexing software has this. For details see differences with other indexing software.


3. What is the difference between the Standard and Pro license versions?

The Standard license has all the features you need in many situations: PDF or plain text input (single-file or multiple files), automatic creation of the initial index, in-context index editing, editing multiple indexes simultaneously (e.g. general and names indexes) and application of filters. Output is in RTF, HTML, LaTeX or plain text format, and fully customizable. Alternatively, you can use embedded indexing of Word .docx documents.

TExtract Pro adds powerful revision and republishing features such as document replacement and re-using existing indexes. It also supports exporting the index into an EPUB version of the book, merging indexes for volume series (master indexes), application of authority files, tabbed table output format, multiple-user editing of large projects, the option to index only bold and italic text, and the keyword allocation facility.

For details see versions and prices.

In TExtract, see Help ► Registration ► license info to find out which Pro features you have used in your trial project, if any. Press the Help button there for information about the features.


4. What is the Single Publication (SP) license?

The SP licenses are affordable versions of the permanent licenses, intended for users (usually authors) who need TExtract just for one publication. An SP license is exactly the same as a normal Standard or Pro license, except that after three months you can no longer create new projects; you can keep working on existing projects.

After installing TExtract, you can use the trial license and start working on your index. The trial license will allow previewing and exporting the initial section of the formatted index, up to letter H. You can edit and export the index as many times as needed. To enable previewing and saving the full formatted index, purchase an SP license.

Note that with a Pro SP license you can update and re-use the index in revisions or later editions of your publication with little extra work. For details see faq 12.


5. How do I get a license?

First download and install TExtract if you haven't done so yet. After running the setup program you have a 30-day trial license. This is a fully functional Pro license, except that when you format the index just the first part of the formatted index is shown, up to letter H.

To upgrade to a true license, start TExtract and in its main menu choose Help ► Registration and copy the serial number shown. Then go to the ordering and registration web page to purchase a license. Enter the serial number where requested.

After completing the order you will receive a license key by email. Copy the key, then in TExtract use Help ► Registration ► Enter license key and paste the key.

Note: When purchasing the license you do not have to download the software again. The license key will unlock features of the installed software.


6. What are the input and output options?

Input is in PDF, Word .docx, or plain text format:
  • When using PDF, TExtract extracts the text and page numbering from it and generates the index from that. The formatted index will be saved as a separate file, to be included in the source document.
  • When using a Word .docx file as input the index entries will be embedded in a copy of the file; for details see faq 16.
  • You can use plain text (.txt) input if it contains some type of page breaks.
  • You can use a single file or a set of files (e.g. chapters of a book, or volumes of a series).

Output is in RTF, Word, HTML, EPUB, LaTeX or plain text format:

  • RTF is a general purpose text format that can be read by most text processing and publishing software (e.g. Word).
  • When using .docx input the output is a copy of the file with the index entries embedded; see faq 16.
  • The HTML output format generates the index as a web page with links to document pages.
  • The EPUB export function integrates the index in the EPUB edition of the book; page references are linked to paragraphs. See faq 15 for details.
  • LaTeX is a specialized format used in academic environments.
  • Exporting in plain text or tabular format is used for special purposes such as further processing of indexes in other software.

The output is fully customizable. For details, in TExtract see Help topic "Formatting and exporting."



7. How do I create a PDF version of my document?

TExtract needs a PDF version of the document. You can use a single PDF file, or a set of PDFs. To create a PDF:
  • In your publishing software use Print > Print to PDF to save the document as PDF. You may need to install a PDF conversion utility to enable this option.
  • Or use Adobe Acrobat.
Please note:
  • Do not use a password-protected PDF. TExtract cannot extract text from it.
  • Every PDF page must contain one document page. Do not use a double-paged PDF.
  • Using PDF editing software to modify or edit the PDF file after it has been created may sometimes cause font problems when text is extracted. It is best to use an "original" PDF file.
  • When using PDF created by a scanner, make sure the scanner software is set to create PDF containing text rather than images.
  • If the document is large (larger than, say, 50 MB) and contains many images, you may want to create a PDF file with the images grayed out, if your PDF creation utility enables this. In the PDF file the images will then be shown as grayed-out blocks. This will greatly reduce the size of the PDF file, making it easier to handle.


8. Can I run TExtract on a Mac?

The short answer is no: TExtract is Windows software. Fortunately, the long answer is yes: TExtract runs fine on a Mac under Boot Camp, Parallels Desktop or VMware Fusion.


9. How do I create separate subject and name indexes?

When editing the index you can create multiple indexes at the same time by assigning an index number to entries. When the index is formatted only the entries with the currently selected index number will be included. Here's how to proceed:
  1. Create a new project as outlined in the basic steps.
  2. On the Index type page of the wizard choose general index.
  3. Press Next, Finish.
  4. When the initial index is ready press Edit to open the editor.
  5. In IndexView choose Extra ► Multiple indexes ► Add index. An additional column with caption # (number sign) at the left side of the entry list shows the index number for each accepted entry. Entries you have already accepted have index number 1. Also shown is the Current index selection item at the top of the entry list. Select the number of the index you want to edit next. Entries you will be accepting are assigned that number.
  6. When you preview or export the index only entries with the Current index number are included. To format another index adjust the Current index setting in IndexView, or close the editor, press the Export button in the main menu and select the index number.
For additional info see the system Help index topic 'multiple indexes'.
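
The effect of index numbers can be pictured as a simple filter over the accepted entries. The Python sketch below only illustrates the idea and is not TExtract's implementation; the `Entry` class, its fields and the `entries_for_index` function are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class Entry:
    term: str
    index_number: int  # hypothetical field: the value shown in the '#' column


def entries_for_index(entries, current_index):
    """Return only the entries assigned to the currently selected index."""
    return [e for e in entries if e.index_number == current_index]


# Hypothetical accepted entries: index 1 is the subject index, index 2 the name index.
entries = [
    Entry("indexing", 1),
    Entry("Jones, A.", 2),
    Entry("locators", 1),
    Entry("Smith, B.", 2),
]

subject_index = entries_for_index(entries, 1)
name_index = entries_for_index(entries, 2)
print([e.term for e in subject_index])
print([e.term for e in name_index])
```

Changing the Current index setting corresponds to calling the filter with a different number; the entries themselves are not changed.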


10. I have a pre-defined set of index terms. Can I use it?

Yes, you can. A pre-defined set of index terms is known as an authority file. TExtract provides powerful facilities for processing authority files, enabling full control of vocabulary and references. Furthermore, using an authority file, your index can be ready in a short time. Note that creating and testing an authority file will take time. If you don't have an authority file, creating the index from scratch in TExtract is recommended.

For details see the system Help info about authority files. Please note: using authority files is a Pro license feature.


11. Not all of the document is available yet. Can I create the index now?

Yes. You can use a plain text manuscript or a PDF proof to create the index, and later replace the document with the final version. See the next faq about revised document versions.


12. Can I re-use the index when I get a revised document?

Yes, you can. You can replace a document with a revised version. The index is updated automatically. This enables creating the index before the final text version is ready, and it supports re-using the index when preparing a new edition. If there are only layout changes and minor text corrections no additional work is needed. Here's how it works:
  1. Place the revised PDF in the folder containing the TExtract project file of the previous version. Use a clear name such as "Jones2023-final.pdf".
  2. Use Project ► Open to open the project of the previous version if it is not open yet.
  3. Choose Project ► Replace document. Select the revised document, press Ok. (If the document consists of a set of pdfs, select the full updated set of pdfs.) A new version of the project is created, and the index is updated. (The old project is not changed.)
  4. If the revised document contains a significant amount of new text that must be indexed, press Create, then press Finish.
  5. Press Edit to edit the revised project as needed. The editor highlights changed text and indicates new entries and references.
  6. Press Export to export the formatted index.

If there is just a small amount of new text you can skip step 4; you can add entries and references manually in the editor if needed. If there are just layout changes or minor corrections and no new indexable text you can skip step 5 as well.

For more info, see system Help index topic Revised documents. Please note: this is a Pro license feature.

See also: If you have a previous edition of the book containing an index, but no TExtract project exists for that edition, see the next faq.


13. I have the index of the previous edition. Can I re-use it?

You can. TExtract enables you to very efficiently re-use the index of a previous edition of a publication, even if no TExtract project exists for it (if a project exists see faq 12). The basic procedure is this:
  1. First you need to create a plain text copy of the index of the previous edition.
  2. Then drop the PDF of the previous edition of the book onto TExtract to set up a project for that edition.
  3. Next choose Project ► Import ► Import a plain text index and import the index.
  4. Then replace the document with the new version (see faq 12); the index is updated.

In the main menu choose Project ► Import ► Re-use existing index (wizard) to bring up a wizard that will guide you through the procedure.

For full details, in the system Help Contents see Special topics > Reusing the index from a previous edition. Please note: this is a Pro license feature.



14. How to re-use the index from the printed edition when republishing as an EPUB ebook

TExtract supports exporting an index into an EPUB edition, for new and existing publications. Once the index for the print version is available, exporting to the EPUB version takes just a few minutes. There are three scenarios A, B and C as described below. (For details, in TExtract see Help index topic "EPUB".)

A. Indexing a new publication: see faq 15.

B. Re-publishing an existing publication for which no TExtract project exists

There are two versions of this scenario:

B1. If the printed edition is current, you need to have a PDF of it available, and you need a plain text file containing the index. Then proceed as follows:

  1. Create the project using the PDF of the printed edition. Do not edit or export.
  2. Next choose Project ► Import ► Import a plain text index and import the plain text index.
  3. Then export the index as explained in faq 15.

B2. If the printed index is for a previous edition of the publication:

  1. Re-use the index of the previous edition as explained in faq 13.
  2. Then export the index as explained in faq 15.

C. Re-publishing an existing publication for which a TExtract project exists

If the project is for the current edition, see faq 15. If the project is for a previous edition first replace the document (faq 12).

Please note: this is a Pro license feature.


15. How to export the index into the EPUB edition of a publication

TExtract can export the index created for the print edition of a publication into the EPUB edition. The EPUB export function integrates the index in the EPUB file. In the EPUB index, references are linked to the exact text paragraphs. Here's how it works.

You need to have the .epub file of the book available. It is best to use the same version of the book as used to create the PDF for the print edition index, i.e. a full version without the index. Place the .epub file in the folder containing the TExtract project for the print edition index. Then:

  1. Open the TExtract project of the print edition index, if it is not open yet.
  2. Drop the .epub file onto TExtract. The export will be set up. This will take a few moments.
  3. When prompted, press Export, then press Ok to create a copy of the .epub file with the index integrated.

If the input EPUB file is named "Smith2023.epub" the new EPUB file is "Smith2023-withindex.epub".
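
If you script around the export (for example to check that the output file has appeared), the naming pattern from the example above can be reproduced as follows. This is a minimal sketch: the function name `with_index_name` is hypothetical, and the rule is inferred from the single example given here, so behavior for unusual file names is an assumption.

```python
from pathlib import Path


def with_index_name(epub_path):
    """Derive the expected output name by inserting '-withindex' before the extension."""
    p = Path(epub_path)
    return p.with_name(p.stem + "-withindex" + p.suffix)


print(with_index_name("Smith2023.epub"))
```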

See the video tutorial (5 minutes).

Notes:

  • An EPUB edition of a book is created using publishing software such as Adobe InDesign. A free alternative is Calibre.
  • To export the index into a MOBI ebook for Kindle, you can convert the MOBI version to EPUB, then export the index into that, then convert back to MOBI format. Converting from MOBI to EPUB and back is easy to do using Calibre.
See also: Read about the different scenarios for new and republished books in faq 14.

Please note: Exporting to EPUB is a Pro license feature.


16. Indexing a Word .docx document (embedded index)

Here's how to use TExtract to create an index for a Microsoft Word .docx document and embed the index entries in a copy of the document. This will enable you to use all of TExtract's indexing power if your publisher prefers an embedded index instead of a separate index manuscript. It is a three-step procedure; steps 2 and 3 take just a few minutes:

  1. Start TExtract, drag-and-drop the .docx file onto it, and follow the instructions shown. When prompted, press Edit to open the editor. You can navigate, inspect, select, edit and add entries and occurrences in various ways. To see the resulting formatted index press the Preview tab at any time. When you have finished editing, close the editor.
  2. Press Export to embed the index entries in a copy of the .docx input document.
  3. Open the copy in Word and generate the formatted index using Word's Insert Index command.

For details in TExtract see Help index topic "indexing a Word .docx document."

See the video tutorial (6 minutes).

Notes:

  • Contact your publisher before using embedded indexing. Publishers often prefer an index manuscript, created using a PDF proof.
  • To create an index manuscript on a Word .docx document, save it as a PDF and create the index from the PDF.
  • For more on index manuscripts and embedded indexes see the next faq.


16.1. Index manuscript or embedded index?

The standard way to create an index for a publication is to write an index manuscript. This is a separate document containing an entry list with page or paragraph references. When using page references, to get the pagination right the index manuscript must be based on a final proof version of the text (but see the faq on index revision). TExtract supports creating an index manuscript from a proof in PDF format. When it is completed, the formatted index (the index manuscript) can be included in the source document (often a Word or InDesign file) as the last chapter, and the final print can be generated.

An alternative approach is to embed the index entries in the source text, at the locations where the references must point to, and have the publishing software generate the index from the embedded entries. This has two advantages: the references are automatically updated if there are text or page layout changes, and the index can be generated for other media besides print. However, it has always been quite laborious to manually create an embedded index. Fortunately, TExtract can now be used to create an embedded index for a Word .docx document. Using TExtract, embedded indexing of a Word document is no more work than creating an index manuscript. For details see the previous faq.

Notes:

  • Contact your publisher before using embedded indexing. Publishers often prefer an index manuscript, created using a PDF proof.
  • Word's Insert Index command offers limited formatting options. For better control over the formatting of the index, use a PDF as input to create an index manuscript, using TExtract's own formatting preferences.
  • To create an index manuscript on a Word .docx document, save it as a PDF and create the index from the PDF.
  • Note that you do not have to use embedded indexing to be able to reuse the index of an earlier text version. Using TExtract, after creating an index manuscript you can replace the document with a revised PDF version and have the index manuscript updated automatically.


17. How do I create a merged index for a series of volumes?

TExtract provides an easy facility for making a merged index (or master index) for a series of projects. This is used in case a publication consists of a growing series of volumes. It is also great if you have a series of publications for which you need separate indexes as well as a merged index. You can also use it to have a set of chapters indexed separately and then merge the indexes.

Here's how it works. First, the available volumes are indexed as separate projects. After the individual indexes have been created and edited, the merged index can be set up and exported:

  1. Create and edit each individual project, as described in faq 1.
  2. Open the first project. This will hold the list of volumes (projects) for the series.
  3. Choose Project ► Volume series ► New.
  4. Add the projects (select .tpro files).
  5. Set volume numbers if needed.
  6. Press Ok.
  7. Now to export the merged index press Export, check Series index, press OK.

You can edit the individual projects and re-export the merged index, as needed. You can add volumes to the series as they become available, and export the merged index again. Create and edit the index for the new volume, then re-open the first project, choose Project ► Volume series ► Volume list to add the new project. Then export the merged index again.

Note: Using the procedure above you will edit each volume separately. In case you intend to create an index for a single document consisting of a fixed set of files, and if all files are available and no indexes per file are required, it is generally recommended to create a single project, to index and edit all files at once. To set up the project choose Project ► New and select all pdfs.

For more info see the system Help topic Creating a merged index for a series of volumes.

Please note: this is a Pro license feature.


18. How to import work from and export work to other indexing software

To import an index created using other indexing software into a TExtract project you need to create a plain text version of the index and in TExtract use Project ► Import ► Import plain text index. For details see Help index topic "importing an existing index." For example, this feature enables very efficient index revision; see faq 13 for more. It also enables exporting an existing index into an EPUB edition of the book; see faq 14-B.

To export an index from TExtract for importing into other software use Export ► Tabbed table. In the Preferences ► Markup page check By reference to create an entry for each page number. The index is saved as a .tab file. For details see Help index topic "tabbed table output format."

Please note: this is a Pro license feature.


19. Can indexers work together to reduce production time?

Yes. You can import work from copies of a project. Here's how:
  1. Open a project and create the initial index as usual. This is the main project, where work from co-indexers will be imported, and where the formatted index will be exported.
  2. Send a copy of the project to each indexer (send the pdf and .tpro files).
  3. Co-indexers can edit the copied projects. Decide who will edit which section of the index, or which part of the text.
  4. When they have finished editing, co-indexers send the .tpro files back. Rename these, and place them in a separate folder.
  5. In the main project use Project ► Import ► Import work from a project and select a returned project's .tpro file.
  6. After all work has been imported, the project can be further edited as needed, and the formatted index can be exported as usual.

Alternatively, you can have a set of chapters indexed separately and then merge the indexes.

Please note: The project owner will need a Pro license version. Co-indexers can use a Standard license version, or a trial license if they do not need to preview the full formatted index.


20. The page numbers are all off. How can I fix this?

In some cases automatic detection of the page numbering used in the text is not possible, and TExtract uses a default numbering, starting at 1. You need to verify and if necessary adjust the numbering before exporting the formatted index. In most cases adjustment takes less than a minute:
  1. In the editor choose Paging > Adjust page numbering.
  2. In the ContextView panel click a page number in one of the Arabic numbered pages, to highlight it in red.
  3. Press F5 to adjust the full page numbering.
  4. Navigate a few occurrences at the beginning and end of the text to verify that the numbering is now correct.
For details and options see menu item Paging > Help.
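
Conceptually, anchoring one known page number and adjusting the rest amounts to applying a constant offset. The sketch below illustrates that idea only; it assumes continuous Arabic numbering, it is not TExtract's actual algorithm, and the function `renumber` and its parameters are hypothetical.

```python
def renumber(pdf_page_count, anchor_pdf_page, anchor_printed_number):
    """Map each PDF page position to a printed page number using one known anchor."""
    offset = anchor_printed_number - anchor_pdf_page
    numbering = {}
    for pdf_page in range(1, pdf_page_count + 1):
        printed = pdf_page + offset
        # Pages that would fall before printed page 1 (for example
        # roman-numbered front matter) are left unnumbered in this sketch.
        numbering[pdf_page] = printed if printed >= 1 else None
    return numbering


# Hypothetical example: the 15th PDF page shows printed page number 3.
numbering = renumber(pdf_page_count=20, anchor_pdf_page=15, anchor_printed_number=3)
print(numbering[13], numbering[20])
```

Checking a few pages near the beginning and end of the result mirrors the verification in step 4.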


21. The index has to be ready by 5pm today. What can I do?

Follow the basic steps, paying attention to these points:
  1. In the Create Initial Index wizard, leave preferences at their standard settings.
  2. In IndexView, set the significance threshold (the digit shown in the toolbar) at 3 or 4, then sort the list of index terms by significance score by clicking column caption 's'. This will place the interesting index terms at the top of the list.
  3. Run through IndexView, checking relevant index terms; do not worry about the number of references for an index term (you can set a maximum number of references when exporting the index); edit term texts if necessary.
  4. After you've run through the entry list, if there is time left, re-sort the list alphabetically (click caption 'index terms'), lower the significance threshold by one or two steps and go back to step 3.
  5. Close the editor windows.
  6. Press main menu item Export, press Preferences, press the References tab, set the include no more than [..] references per entry setting e.g. to 8 or 10. Press the Format tab if you want to adjust the output format. Press Ok, Ok.
  7. Review the exported index file - it will need final editing!
After exporting the index, you can always re-open the editor and do some more work, but note that when you re-export the index, work done in step 7 has to be repeated.


22. Not all references of an entry are shown - why?

When you edit the index, entries with a significance score below the significance threshold are hidden. To see such entries, lower the threshold (the digit shown in the toolbar in IndexView).

When previewing or exporting the formatted index, you can use the Preferences button to set options. Pay attention to these settings on the References tab page:

  • In the Reference range section, adjust the settings for From: and To:
  • Adjust the value for the maximum number of references per entry.


23. What are the keyword output options?

Besides creating indexes, TExtract can output keyword lists, e.g. to assign keywords to units of text. Listing keywords by page number is very useful when reviewing the coverage of an index. You can also list keywords per document.

To create a listing of keywords use one of the options under main menu item Export ► Keywords. You can output keywords by any unit of reference (i.e. page number, section, article number, document, etc.). These are the available options:

  • Use List by unit of reference to create a listing of keywords grouped by unit of reference. Set options as needed.
  • Use Simple list (edited) to create a simple keyword list. Only accepted entries will be included as keywords.
  • Use Simple list (automatic) to have keywords selected automatically. Set options as needed.
Note:
  • To create keywords using a controlled vocabulary, use an authority file.
Please note: keyword output is a Pro license feature.


24. Can TExtract use paragraph identifiers as locators?

It can. When setting up a project, by default TExtract scans the left margins of the PDF for the presence of paragraph IDs. (The right margins can optionally be included.) If IDs are found, TExtract asks whether they should be used as locators. In most cases no further adjustment is needed.

For details, in the Help index see "paragraph identifiers."