com.lowagie.text.pdf.parser
Class SimpleTextExtractingPdfContentRenderListener

java.lang.Object
  extended by com.lowagie.text.pdf.parser.SimpleTextExtractingPdfContentRenderListener
All Implemented Interfaces:
RenderListener, TextProvidingRenderListener

public class SimpleTextExtractingPdfContentRenderListener
extends Object
implements TextProvidingRenderListener

A simple text extraction renderer. It tracks the Y position of each rendered string and inserts a line break into the output whenever that position changes. If the PDF does not render its text from top to bottom, the extracted text will not match the order in which it appears on the page. The renderer also uses a simple strategy based on font metrics to decide whether a blank space should be inserted into the output.
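
The heuristic described above can be illustrated with a small stand-alone sketch. This is not the library's implementation: the class name SketchTextCollector, the flattened renderText parameters (text, start X, end X, Y, width of a space), and the "gap larger than half a space width" threshold are all assumptions chosen for illustration. The real class receives a TextRenderInfo and derives positions and font metrics from it.

```java
/**
 * Hypothetical stand-in for the renderer described above. It does not use
 * the parser API; positions and the space width are passed in directly.
 */
public class SketchTextCollector {
    private final StringBuilder result = new StringBuilder();
    private boolean first = true;   // true until the first string is rendered
    private float lastYPos;         // Y position of the last rendered text
    private float lastEndingXPos;   // X position where the last rendered text ended

    /** Clears the collected text and the position tracking. */
    public void reset() {
        result.setLength(0);
        first = true;
        lastYPos = 0f;
        lastEndingXPos = 0f;
    }

    /**
     * Appends one string. A changed Y position yields a line break;
     * on the same line, a horizontal gap wider than half a space
     * (assumed threshold) yields a blank space.
     */
    public void renderText(String text, float startX, float endX,
                           float y, float spaceWidth) {
        if (!first) {
            if (y != lastYPos) {
                result.append('\n');
            } else if (startX - lastEndingXPos > spaceWidth / 2f) {
                result.append(' ');
            }
        }
        result.append(text);
        first = false;
        lastYPos = y;
        lastEndingXPos = endX;
    }

    /** Returns the text collected so far. */
    public String getResultantText() {
        return result.toString();
    }

    public static void main(String[] args) {
        SketchTextCollector collector = new SketchTextCollector();
        collector.renderText("Hello", 50f, 80f, 700f, 5f);
        collector.renderText("world", 85f, 115f, 700f, 5f);  // gap 5 > 2.5: space
        collector.renderText("!", 115.5f, 118f, 700f, 5f);   // gap 0.5: no space
        collector.renderText("Next", 50f, 75f, 686f, 5f);    // Y changed: line break
        System.out.println(collector.getResultantText());
    }
}
```

Because the sketch reacts to any change in Y, strings drawn out of reading order (for example a footer emitted before the body) would come out in drawing order, which is the limitation the description warns about.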

Since:
2.1.5

Field Summary
private  Vector lastEnd
           
private  float lastEndingXPos
          keeps track of the X position of the end of the last rendered text
private  Vector lastStart
           
private  Matrix lastTextLineMatrix
           
private  float lastYPos
          keeps track of the Y position of the last rendered text
private  StringBuffer result
          used to store the resulting String.
 
Constructor Summary
SimpleTextExtractingPdfContentRenderListener()
          Creates a new text extraction renderer.
 
Method Summary
 String getResultantText()
          Returns the result so far.
 void renderText(TextRenderInfo renderInfo)
          Captures text using a simplified algorithm for inserting hard returns and spaces.
 void reset()
          Resets the internal state of the RenderListener.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

lastYPos

private float lastYPos
keeps track of the Y position of the last rendered text


lastEndingXPos

private float lastEndingXPos
keeps track of the X position of the end of the last rendered text


lastTextLineMatrix

private Matrix lastTextLineMatrix

lastStart

private Vector lastStart

lastEnd

private Vector lastEnd

result

private StringBuffer result
used to store the resulting String.

Constructor Detail

SimpleTextExtractingPdfContentRenderListener

public SimpleTextExtractingPdfContentRenderListener()
Creates a new text extraction renderer.

Method Detail

reset

public void reset()
Description copied from interface: RenderListener
Resets the internal state of the RenderListener.

Specified by:
reset in interface RenderListener

getResultantText

public String getResultantText()
Returns the result so far.

Specified by:
getResultantText in interface TextProvidingRenderListener
Returns:
a String with the resulting text.

renderText

public void renderText(TextRenderInfo renderInfo)
Captures text using a simplified algorithm for inserting hard returns and spaces.

Specified by:
renderText in interface RenderListener
Parameters:
renderInfo - information specifying what to render
See Also:
com.lowagie.text.pdf.parser.AbstractRenderListener#renderText(java.lang.String, com.lowagie.text.pdf.parser.GraphicsState, com.lowagie.text.pdf.parser.Matrix, com.lowagie.text.pdf.parser.Matrix)
