org.apache.solr.handler.dataimport
Class EntityProcessorWrapper

java.lang.Object
  extended by org.apache.solr.handler.dataimport.EntityProcessor
      extended by org.apache.solr.handler.dataimport.EntityProcessorWrapper
Direct Known Subclasses:
ThreadedEntityProcessorWrapper

public class EntityProcessorWrapper
extends EntityProcessor

A wrapper over an EntityProcessor instance that applies transformers and handles multi-row outputs correctly.

Since:
solr 1.4
Version:
$Id: EntityProcessorWrapper.java 1304024 2012-03-22 20:15:45Z jdyer $

Field Summary
protected  VariableResolverImpl resolver
           
protected  List<Map<String,Object>> rowcache
           
protected  List<Transformer> transformers
           
 
Constructor Summary
EntityProcessorWrapper(EntityProcessor delegate, DocBuilder docBuilder)
           
 
Method Summary
protected  Map<String,Object> applyTransformer(Map<String,Object> row)
          Returns null when given a null row; otherwise calls the transformers via transformRow(java.util.Map), assigns the result to the row cache, and delegates to getFromRowCache().
 void close()
          Invoked when the Entity processor is destroyed towards the end of import.
 void destroy()
          Invoked for each parent-row after the last row for this entity is processed.
 Context getContext()
           
protected  Map<String,Object> getFromRowCache()
           
 VariableResolverImpl getVariableResolver()
           
 void init(Context context)
          This method is called when it starts processing an entity.
 Map<String,Object> nextDeletedRowKey()
          This is used during delta-import.
 Map<String,Object> nextModifiedParentRowKey()
          This is used during delta-import.
 Map<String,Object> nextModifiedRowKey()
          This is used for delta-import.
 Map<String,Object> nextRow()
          For a root entity, retrieves a single row, transforms it, and loops until the transformers let a first row through. For child entities, the whole page is pulled.
protected  Map<String,Object> pullRow()
          Pulls a single row from the delegate EntityProcessor.
protected  List<Map<String,Object>> transformRow(Map<String,Object> row)
          Initialises the transformers and applies them to the given row.
 
Methods inherited from class org.apache.solr.handler.dataimport.EntityProcessor
postTransform
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

resolver

protected VariableResolverImpl resolver

transformers

protected List<Transformer> transformers

rowcache

protected List<Map<String,Object>> rowcache
Constructor Detail

EntityProcessorWrapper

public EntityProcessorWrapper(EntityProcessor delegate,
                              DocBuilder docBuilder)
Method Detail

init

public void init(Context context)
Description copied from class: EntityProcessor
This method is called when processing of an entity starts. When processing returns to the entity, it is called again, so the implementation can reset any state at that point. For the root-most entity, this is called only once per ingestion. For sub-entities, it is called multiple times, once for each row from the parent entity.

Specified by:
init in class EntityProcessor
Parameters:
context - The current context

getFromRowCache

protected Map<String,Object> getFromRowCache()

applyTransformer

protected Map<String,Object> applyTransformer(Map<String,Object> row)
Returns null when given a null row. Otherwise calls the transformers via transformRow(java.util.Map), assigns the result to the row cache, and delegates to getFromRowCache().
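
The flow described above can be pictured with a minimal sketch. It does not use the Solr classes: RowCacheSketch is a hypothetical stand-in, its transformRow simply emits every row twice, and the behaviour of getFromRowCache() (hand out the first cached row, return null once the cache is empty) is an assumption, since this page does not document it.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for illustration only; not the Solr implementation.
class RowCacheSketch {
    // Rows produced by the last transform that have not been handed out yet.
    protected List<Map<String, Object>> rowcache;

    // Stand-in transform: emits the given row plus one copy of it.
    protected List<Map<String, Object>> transformRow(Map<String, Object> row) {
        List<Map<String, Object>> out = new ArrayList<>();
        out.add(row);
        out.add(new HashMap<>(row));
        return out;
    }

    // Assumed behaviour: return the next cached row, or null when none is left.
    protected Map<String, Object> getFromRowCache() {
        if (rowcache == null || rowcache.isEmpty()) {
            return null;
        }
        Map<String, Object> next = rowcache.remove(0);
        if (rowcache.isEmpty()) {
            rowcache = null;
        }
        return next;
    }

    protected Map<String, Object> applyTransformer(Map<String, Object> row) {
        if (row == null) {
            return null; // null in, null out
        }
        rowcache = transformRow(row);
        return getFromRowCache();
    }
}
```

In this sketch, one call to applyTransformer returns the first emitted row, and further calls to getFromRowCache() drain the remaining rows before returning null.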


transformRow

protected List<Map<String,Object>> transformRow(Map<String,Object> row)
Initialises the transformers and applies them to the given row. The returned collection is mutable.

Returns:
the rows emitted by the transformers. If there are no transformers, returns a single-element list containing the given row; if a transformer returns null, returns an empty collection.
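
The return contract can be illustrated with a small sketch. The RowTransformer interface and the TransformChainSketch class are hypothetical names introduced here; the real Transformer API takes additional arguments and is not reproduced. The sketch only aims to show the three documented cases: no transformers, a transformer returning null, and a transformer emitting several rows.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for illustration only; not the Solr implementation.
class TransformChainSketch {
    // Assumed shape: a transformer maps one row to zero or more rows, or null.
    interface RowTransformer {
        List<Map<String, Object>> apply(Map<String, Object> row);
    }

    static List<Map<String, Object>> transformRow(
            Map<String, Object> row, List<RowTransformer> transformers) {
        // Start from a mutable single-element list holding the given row.
        List<Map<String, Object>> rows = new ArrayList<>();
        rows.add(row);
        for (RowTransformer t : transformers) {
            List<Map<String, Object>> next = new ArrayList<>();
            for (Map<String, Object> r : rows) {
                List<Map<String, Object>> emitted = t.apply(r);
                if (emitted != null) {
                    next.addAll(emitted); // a null result drops the row
                }
            }
            rows = next;
        }
        return rows;
    }
}
```

With an empty transformer list the result contains exactly the input row; a transformer that returns null yields an empty list; a transformer that returns two rows yields two rows for the next transformer in the chain.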

nextRow

public Map<String,Object> nextRow()
For a root entity, retrieves a single row, transforms it, and loops until the transformers let a first row through. For child entities, the whole page is pulled, where the page is the set of non-null child entity rows; the whole page is then transformed and emitted to the row cache. The rationale is to avoid child rows being stolen by parent entity threads. For every parent row, the linked child rows (the page) are pulled under a lock obtained on the delegate EntityProcessor.

Specified by:
nextRow in class EntityProcessor
Returns:
A 'row'. The 'key' for the map is the column name and the 'value' is the value of that column. If there are no more rows to be returned, return 'null'
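
The root-entity case can be sketched as a loop that keeps pulling rows until one survives the transform. RootNextRowSketch, RowSource, and RowTransformer are hypothetical names used only for this illustration, and serving leftover rows from the cache before pulling again is an assumption.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for illustration only; not the Solr implementation.
class RootNextRowSketch {
    // Assumed delegate shape: returns the next raw row, or null when exhausted.
    interface RowSource {
        Map<String, Object> nextRow();
    }

    // Assumed transformer shape: returns emitted rows, or null to reject the row.
    interface RowTransformer {
        List<Map<String, Object>> apply(Map<String, Object> row);
    }

    private final RowSource delegate;
    private final RowTransformer transformer;
    private List<Map<String, Object>> rowcache;

    RootNextRowSketch(RowSource delegate, RowTransformer transformer) {
        this.delegate = delegate;
        this.transformer = transformer;
    }

    Map<String, Object> nextRow() {
        // Serve rows left over from an earlier multi-row transform first.
        if (rowcache != null && !rowcache.isEmpty()) {
            return rowcache.remove(0);
        }
        while (true) {
            Map<String, Object> raw = delegate.nextRow();
            if (raw == null) {
                return null; // no more rows
            }
            List<Map<String, Object>> out = transformer.apply(raw);
            if (out == null || out.isEmpty()) {
                continue; // row rejected by the transformer; pull the next one
            }
            rowcache = new ArrayList<>(out);
            return rowcache.remove(0);
        }
    }
}
```

The loop matters because a rejected row must not be mistaken for the end of the data: only a null from the delegate ends the iteration.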

pullRow

protected Map<String,Object> pullRow()
Pulls a single row from the delegate EntityProcessor. It expects to be called inside a synchronized(delegate) section.

Returns:
row from delegate
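
A minimal sketch of pulling a page of child rows under a lock on the delegate follows. PagePullSketch, RowSource, and pullPage are hypothetical names introduced for this illustration; only the pattern (call pullRow() repeatedly inside synchronized(delegate) until it returns null) is taken from the descriptions on this page.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for illustration only; not the Solr implementation.
class PagePullSketch {
    // Assumed delegate shape: returns the next row, or null when exhausted.
    interface RowSource {
        Map<String, Object> nextRow();
    }

    private final RowSource delegate;

    PagePullSketch(RowSource delegate) {
        this.delegate = delegate;
    }

    // Expects the caller to hold the lock on the delegate.
    protected Map<String, Object> pullRow() {
        return delegate.nextRow();
    }

    // Collects all non-null rows for the current parent row as one page.
    List<Map<String, Object>> pullPage() {
        List<Map<String, Object>> page = new ArrayList<>();
        synchronized (delegate) {
            Map<String, Object> row;
            while ((row = pullRow()) != null) {
                page.add(row);
            }
        }
        return page;
    }
}
```

Holding the lock for the whole page, rather than per row, is what keeps another thread from interleaving its own pulls and taking rows that belong to this parent row.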

nextModifiedRowKey

public Map<String,Object> nextModifiedRowKey()
Description copied from class: EntityProcessor
This is used for delta-import. It gives the primary keys of the changed rows in this entity.

Specified by:
nextModifiedRowKey in class EntityProcessor
Returns:
the pk vs value of all changed rows

nextDeletedRowKey

public Map<String,Object> nextDeletedRowKey()
Description copied from class: EntityProcessor
This is used during delta-import. It gives the primary keys of the rows that are deleted from this entity. If this entity is the root entity, the Solr document is deleted. If this is a sub-entity, the Solr document is considered 'changed' and will be recreated.

Specified by:
nextDeletedRowKey in class EntityProcessor
Returns:
the pk vs value of all deleted rows

nextModifiedParentRowKey

public Map<String,Object> nextModifiedParentRowKey()
Description copied from class: EntityProcessor
This is used during delta-import. This gives the primary keys and their values of all the rows changed in a parent entity due to changes in this entity.

Specified by:
nextModifiedParentRowKey in class EntityProcessor
Returns:
the pk vs value of all changed rows in the parent entity

destroy

public void destroy()
Description copied from class: EntityProcessor
Invoked for each parent-row after the last row for this entity is processed. If this is the root-most entity, it will be called only once in the import, at the very end.

Specified by:
destroy in class EntityProcessor

getVariableResolver

public VariableResolverImpl getVariableResolver()

getContext

public Context getContext()

close

public void close()
Description copied from class: EntityProcessor
Invoked when the Entity processor is destroyed towards the end of import.

Overrides:
close in class EntityProcessor
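
The call counts described for init, destroy, and close can be pictured with a small sketch. LifecycleSketch, CountingProcessor, and runImport are hypothetical names; the driver loop is an assumption about call order written only to match the descriptions on this page, not the actual DocBuilder logic.

```java
// Hypothetical stand-in for illustration only; not the Solr implementation.
class LifecycleSketch {
    // Counts lifecycle calls instead of doing real work.
    static class CountingProcessor {
        int inits;
        int destroys;
        int closes;

        void init() { inits++; }
        void destroy() { destroys++; }
        void close() { closes++; }
    }

    // Assumed driver: one root entity with one sub-entity.
    static void runImport(int parentRows, CountingProcessor root, CountingProcessor child) {
        root.init(); // root-most entity: initialised once per import
        for (int i = 0; i < parentRows; i++) {
            child.init();    // sub-entity: once for each parent row
            // ... rows of the sub-entity would be consumed here ...
            child.destroy(); // after the last row for this parent row
        }
        root.destroy(); // root-most entity: once, at the very end
        root.close();   // close: towards the end of the import
        child.close();
    }
}
```

For an import with three parent rows, this sketch calls init and destroy on the sub-entity three times each, and on the root entity once each; close is called once per processor.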