Package org.htmlcleaner
Class TagNode
- java.lang.Object
  - org.htmlcleaner.TagToken
    - org.htmlcleaner.TagNode
Direct Known Subclasses:
Serializer.HeadlessTagNode
public class TagNode extends TagToken implements HtmlNode
XML node tag: the basic node of the cleaned HTML tree. It also represents a start tag token after the HTML parsing phase and before the cleaning phase. After cleaning, the tree consists of tag nodes (TagNode), content (text nodes, ContentNode), comments (CommentNode) and, optionally, a doctype node (DoctypeToken).
-
-
Nested Class Summary
Nested Classes: Modifier and Type, Class, Description
static interface
TagNode.ITagNodeCondition
Used as base for different node checkers.
class
TagNode.TagAllCondition
All nodes.
class
TagNode.TagNodeAttExistsCondition
Checks if node contains specified attribute.
class
TagNode.TagNodeAttValueCondition
Checks if node has specified attribute with specified value.
class
TagNode.TagNodeNameCondition
Checks if node has specified name.
-
Field Summary
Fields: Modifier and Type, Field, Description
private java.util.Map<java.lang.String,java.lang.String>
attributes
private java.util.List
children
private DoctypeToken
docType
private boolean
isFormed
private java.util.List<BaseToken>
itemsToMove
private java.util.Map<java.lang.String,java.lang.String>
nsDeclarations
private TagNode
parent
-
Constructor Summary
Constructors: Constructor, Description
TagNode(java.lang.String name)
-
Method Summary
All Methods: Modifier and Type, Method, Description
void
addAttribute(java.lang.String attName, java.lang.String attValue)
Deprecated. Use setAttribute instead. Adds specified attribute to this tag or overrides existing one.
void
addChild(java.lang.Object child)
void
addChildren(java.util.List newChildren)
Add all elements from specified list to this node.
(package private) void
addItemForMoving(BaseToken item)
void
addNamespaceDeclaration(java.lang.String nsPrefix, java.lang.String nsURI)
Adds namespace declaration to the node.
(package private) void
collectNamespacePrefixesOnPath(java.util.Set<java.lang.String> prefixes)
Collects all prefixes in namespace declarations up the path to the document root from the specified node.
java.lang.Object[]
evaluateXPath(java.lang.String xPathExpression)
Evaluates XPath expression on the given node.
private TagNode
findElement(TagNode.ITagNodeCondition condition, boolean isRecursive)
Finds the first element in the tree that satisfies the specified condition.
TagNode
findElementByAttValue(java.lang.String attName, java.lang.String attValue, boolean isRecursive, boolean isCaseSensitive)
TagNode
findElementByName(java.lang.String findName, boolean isRecursive)
TagNode
findElementHavingAttribute(java.lang.String attName, boolean isRecursive)
TagNode[]
getAllElements(boolean isRecursive)
java.util.List
getAllElementsList(boolean isRecursive)
java.lang.String
getAttributeByName(java.lang.String attName)
java.util.Map<java.lang.String,java.lang.String>
getAttributes()
int
getChildIndex(HtmlNode child)
java.util.List
getChildren()
java.util.List
getChildTagList()
TagNode[]
getChildTags()
DoctypeToken
getDocType()
private java.util.List
getElementList(TagNode.ITagNodeCondition condition, boolean isRecursive)
Get all elements in the tree that satisfy specified condition.
java.util.List
getElementListByAttValue(java.lang.String attName, java.lang.String attValue, boolean isRecursive, boolean isCaseSensitive)
java.util.List
getElementListByName(java.lang.String findName, boolean isRecursive)
java.util.List
getElementListHavingAttribute(java.lang.String attName, boolean isRecursive)
private TagNode[]
getElements(TagNode.ITagNodeCondition condition, boolean isRecursive)
TagNode[]
getElementsByAttValue(java.lang.String attName, java.lang.String attValue, boolean isRecursive, boolean isCaseSensitive)
TagNode[]
getElementsByName(java.lang.String findName, boolean isRecursive)
TagNode[]
getElementsHavingAttribute(java.lang.String attName, boolean isRecursive)
(package private) java.util.List<BaseToken>
getItemsToMove()
java.util.Map<java.lang.String,java.lang.String>
getNamespaceDeclarations()
(package private) java.lang.String
getNamespaceURIOnPath(java.lang.String nsPrefix)
TagNode
getParent()
java.lang.StringBuffer
getText()
boolean
hasAttribute(java.lang.String attName)
Checks existence of the specified attribute.
boolean
hasChildren()
void
insertChild(int index, HtmlNode childToAdd)
Inserts specified node at specified position in array of children.
void
insertChildAfter(HtmlNode node, HtmlNode nodeToInsert)
Inserts specified node in the list of children after specified child.
void
insertChildBefore(HtmlNode node, HtmlNode nodeToInsert)
Inserts specified node in the list of children before specified child.
(package private) boolean
isFormed()
(package private) TagNode
makeCopy()
void
removeAllChildren()
Removes all children (subelements and text content).
void
removeAttribute(java.lang.String attName)
Removes specified attribute from this tag.
boolean
removeChild(java.lang.Object child)
Remove specified child element from this node.
boolean
removeFromTree()
Remove this node from the tree.
void
replaceChild(HtmlNode childToReplace, HtmlNode replacement)
Replaces specified child node with specified replacement node.
void
serialize(Serializer serializer, java.io.Writer writer)
void
setAttribute(java.lang.String attName, java.lang.String attValue)
Adds a new attribute or overrides an existing one.
(package private) void
setChildren(java.util.List children)
void
setDocType(DoctypeToken docType)
(package private) void
setFormed()
(package private) void
setFormed(boolean isFormed)
(package private) void
setItemsToMove(java.util.List<BaseToken> itemsToMove)
boolean
setName(java.lang.String name)
Changes name of the tag.
(package private) void
transformAttributes(TagTransformation tagTrans)
void
traverse(TagNodeVisitor visitor)
Traverses the tree and performs visitor's action on each node.
private boolean
traverseInternally(TagNodeVisitor visitor)
-
-
-
Field Detail
-
parent
private TagNode parent
-
attributes
private java.util.Map<java.lang.String,java.lang.String> attributes
-
children
private java.util.List children
-
docType
private DoctypeToken docType
-
nsDeclarations
private java.util.Map<java.lang.String,java.lang.String> nsDeclarations
-
itemsToMove
private java.util.List<BaseToken> itemsToMove
-
isFormed
private transient boolean isFormed
-
-
Method Detail
-
setName
public boolean setName(java.lang.String name)
Changes name of the tag.
- Parameters:
name
- Returns:
- True if new name is valid, false otherwise
-
getAttributeByName
public java.lang.String getAttributeByName(java.lang.String attName)
- Parameters:
attName
- Returns:
- Value of the specified attribute, or null if this tag doesn't contain it.
-
getAttributes
public java.util.Map<java.lang.String,java.lang.String> getAttributes()
- Returns:
- Map instance containing all attribute name/value pairs.
-
hasAttribute
public boolean hasAttribute(java.lang.String attName)
Checks existence of the specified attribute.
- Parameters:
attName
-
addAttribute
@Deprecated public void addAttribute(java.lang.String attName, java.lang.String attValue)
Deprecated. Use setAttribute instead.
Adds specified attribute to this tag or overrides existing one.
- Parameters:
attName
attValue
-
setAttribute
public void setAttribute(java.lang.String attName, java.lang.String attValue)
Adds a new attribute or overrides an existing one.
- Specified by:
setAttribute in class TagToken
- Parameters:
attName
attValue
-
addNamespaceDeclaration
public void addNamespaceDeclaration(java.lang.String nsPrefix, java.lang.String nsURI)
Adds namespace declaration to the node.
- Parameters:
nsPrefix
- Namespace prefix
nsURI
- Namespace URI
-
getNamespaceDeclarations
public java.util.Map<java.lang.String,java.lang.String> getNamespaceDeclarations()
- Returns:
- Map of namespace declarations for this node
-
removeAttribute
public void removeAttribute(java.lang.String attName)
Removes specified attribute from this tag.
- Parameters:
attName
-
getChildren
public java.util.List getChildren()
- Returns:
- List of child objects. During the cleanup process it may hold different kinds of children; after cleaning there should be only TagNode instances.
-
hasChildren
public boolean hasChildren()
- Returns:
- Whether this node has child elements or not.
-
setChildren
void setChildren(java.util.List children)
-
getChildTagList
public java.util.List getChildTagList()
-
getChildTags
public TagNode[] getChildTags()
- Returns:
- An array of child TagNode instances.
-
getText
public java.lang.StringBuffer getText()
- Returns:
- Text content of this node and its subelements.
-
getParent
public TagNode getParent()
- Returns:
- Parent of this node, or null if this is the root node.
-
getDocType
public DoctypeToken getDocType()
-
setDocType
public void setDocType(DoctypeToken docType)
-
addChild
public void addChild(java.lang.Object child)
-
addChildren
public void addChildren(java.util.List newChildren)
Add all elements from specified list to this node.
- Parameters:
newChildren
-
findElement
private TagNode findElement(TagNode.ITagNodeCondition condition, boolean isRecursive)
Finds the first element in the tree that satisfies the specified condition.
- Parameters:
condition
isRecursive
- Returns:
- First TagNode found, or null if no such element exists.
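The condition-based lookup described above can be pictured with a small, self-contained sketch. It does not use the library: `Node`, `Condition`, `nameIs` and `hasAttribute` are hypothetical stand-ins for TagNode and the nested condition classes, and the search order (children in document order, descending before moving to the next sibling) is an assumption, since this page does not specify it.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class ConditionSearchSketch {
    /** Hypothetical stand-in for a tag node; not the library's TagNode. */
    static final class Node {
        final String name;
        final Map<String, String> attributes = new HashMap<>();
        final List<Node> children = new ArrayList<>();

        Node(String name) { this.name = name; }

        Node attr(String key, String value) { attributes.put(key, value); return this; }

        Node add(Node child) { children.add(child); return this; }
    }

    /** Hypothetical node checker: decides whether a node matches. */
    interface Condition {
        boolean satisfy(Node node);
    }

    static Condition nameIs(String name) {
        return node -> node.name.equalsIgnoreCase(name);
    }

    static Condition hasAttribute(String attName) {
        return node -> node.attributes.containsKey(attName);
    }

    /**
     * Returns the first match among the children of start (and deeper
     * levels when isRecursive is true), or null if nothing matches.
     */
    static Node findElement(Node start, Condition condition, boolean isRecursive) {
        for (Node child : start.children) {
            if (condition.satisfy(child)) {
                return child;
            }
            if (isRecursive) {
                Node found = findElement(child, condition, true);
                if (found != null) {
                    return found;
                }
            }
        }
        return null;
    }

    /** Builds a tree shaped like: div > [ p(id=a), span > [ a(href=x) ] ]. */
    static Node sampleTree() {
        return new Node("div")
                .add(new Node("p").attr("id", "a"))
                .add(new Node("span").add(new Node("a").attr("href", "x")));
    }

    public static void main(String[] args) {
        Node root = sampleTree();
        Node shallow = findElement(root, nameIs("a"), false);
        Node deep = findElement(root, nameIs("a"), true);
        System.out.println(shallow == null ? "none" : shallow.name);
        System.out.println(deep == null ? "none" : deep.name);
    }
}
```

In this sketch the non-recursive search looks only at direct children, so the nested `a` element is found only when `isRecursive` is true.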
-
getElementList
private java.util.List getElementList(TagNode.ITagNodeCondition condition, boolean isRecursive)
Gets all elements in the tree that satisfy the specified condition.
- Parameters:
condition
isRecursive
- Returns:
- List of TagNode instances that satisfy the specified condition.
-
getElements
private TagNode[] getElements(TagNode.ITagNodeCondition condition, boolean isRecursive)
- Parameters:
condition
isRecursive
- Returns:
- The array of all subelements that satisfy the specified condition.
-
getAllElementsList
public java.util.List getAllElementsList(boolean isRecursive)
-
getAllElements
public TagNode[] getAllElements(boolean isRecursive)
-
findElementByName
public TagNode findElementByName(java.lang.String findName, boolean isRecursive)
-
getElementListByName
public java.util.List getElementListByName(java.lang.String findName, boolean isRecursive)
-
getElementsByName
public TagNode[] getElementsByName(java.lang.String findName, boolean isRecursive)
-
findElementHavingAttribute
public TagNode findElementHavingAttribute(java.lang.String attName, boolean isRecursive)
-
getElementListHavingAttribute
public java.util.List getElementListHavingAttribute(java.lang.String attName, boolean isRecursive)
-
getElementsHavingAttribute
public TagNode[] getElementsHavingAttribute(java.lang.String attName, boolean isRecursive)
-
findElementByAttValue
public TagNode findElementByAttValue(java.lang.String attName, java.lang.String attValue, boolean isRecursive, boolean isCaseSensitive)
-
getElementListByAttValue
public java.util.List getElementListByAttValue(java.lang.String attName, java.lang.String attValue, boolean isRecursive, boolean isCaseSensitive)
-
getElementsByAttValue
public TagNode[] getElementsByAttValue(java.lang.String attName, java.lang.String attValue, boolean isRecursive, boolean isCaseSensitive)
-
evaluateXPath
public java.lang.Object[] evaluateXPath(java.lang.String xPathExpression) throws XPatherException
Evaluates XPath expression on the given node.
This is not a fully supported XPath parser and evaluator. The examples below show supported elements:
- //div//a
- //div//a[@id][@class]
- /body/*[1]/@type
- //div[3]//a[@id][@href='r/n4']
- (//div[last() >= 4]//./div[position() = last()])[position() > 22]//li[2]//a
- //div[2]/@*[2]
- data(//div//a[@id][@class])
- //p/last()
- //body//div[3][@class]//span[12.2
- data(//a['v' < @id])
- Parameters:
xPathExpression
- Returns:
- Throws:
XPatherException
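To make two of the listed expressions concrete, here is a rough sketch that evaluates them with the JDK's own `javax.xml.xpath` engine on a small well-formed document. This is a swapped-in evaluator, not `TagNode.evaluateXPath`, so it only illustrates what the expressions select; the sample markup and the `countMatches` helper are made up for the illustration.

```java
import java.io.StringReader;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class XPathSketch {
    /** Hypothetical sample: two anchors under a div, one anchor outside it. */
    static final String SAMPLE =
            "<body><div><p><a id='x' class='y'/></p><a id='z'/></div><a id='q' class='r'/></body>";

    /** Counts the nodes selected by the expression, using the JDK XPath engine. */
    static int countMatches(String xml, String expression) {
        try {
            NodeList nodes = (NodeList) XPathFactory.newInstance().newXPath()
                    .evaluate(expression, new InputSource(new StringReader(xml)),
                            XPathConstants.NODESET);
            return nodes.getLength();
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException("Cannot evaluate: " + expression, e);
        }
    }

    public static void main(String[] args) {
        // Every anchor anywhere below a div.
        System.out.println(countMatches(SAMPLE, "//div//a"));
        // Only those anchors that carry both an id and a class attribute.
        System.out.println(countMatches(SAMPLE, "//div//a[@id][@class]"));
    }
}
```

Because the evaluator documented here is described as only partially supporting XPath, results for the less common expressions in the list may differ from what a full engine returns.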
-
removeFromTree
public boolean removeFromTree()
Removes this node from the tree.
- Returns:
- True if the element was removed (that is, if it is not the root node).
-
removeChild
public boolean removeChild(java.lang.Object child)
Remove specified child element from this node.
- Parameters:
child
- Returns:
- True if child object existed in the children list.
-
removeAllChildren
public void removeAllChildren()
Removes all children (subelements and text content).
-
replaceChild
public void replaceChild(HtmlNode childToReplace, HtmlNode replacement)
Replaces specified child node with specified replacement node.
- Parameters:
childToReplace
- Child node to be replaced
replacement
- Replacement node
-
getChildIndex
public int getChildIndex(HtmlNode child)
- Parameters:
child
- Child to find index of
- Returns:
- Index of the specified child node inside this node's children, or -1 if the node is not a child
-
insertChild
public void insertChild(int index, HtmlNode childToAdd)
Inserts specified node at specified position in array of children.
- Parameters:
index
childToAdd
-
insertChildBefore
public void insertChildBefore(HtmlNode node, HtmlNode nodeToInsert)
Inserts specified node in the list of children before specified child.
- Parameters:
node
- Child before which to insert new node
nodeToInsert
- Node to be inserted at specified position
-
insertChildAfter
public void insertChildAfter(HtmlNode node, HtmlNode nodeToInsert)
Inserts specified node in the list of children after specified child.
- Parameters:
node
- Child after which to insert new node
nodeToInsert
- Node to be inserted at specified position
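The relationship between getChildIndex, insertChild, insertChildBefore and insertChildAfter can be sketched on a plain `java.util.List`. This is an illustration only: the helper names mirror the methods above but operate on a hypothetical list of strings, lookup uses `List.indexOf` (equality, not necessarily what the library uses), and silently ignoring a reference child that is not in the list is an assumption, not documented behaviour.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class ChildListSketch {
    /** Index of the child in the list, or -1 if it is not present. */
    static <T> int childIndex(List<T> children, T child) {
        return children.indexOf(child);
    }

    /** Inserts toAdd at the given position, shifting later children right. */
    static <T> void insertChild(List<T> children, int index, T toAdd) {
        children.add(index, toAdd);
    }

    /** Inserts toInsert directly before node; does nothing if node is absent (assumption). */
    static <T> void insertChildBefore(List<T> children, T node, T toInsert) {
        int index = childIndex(children, node);
        if (index >= 0) {
            insertChild(children, index, toInsert);
        }
    }

    /** Inserts toInsert directly after node; does nothing if node is absent (assumption). */
    static <T> void insertChildAfter(List<T> children, T node, T toInsert) {
        int index = childIndex(children, node);
        if (index >= 0) {
            insertChild(children, index + 1, toInsert);
        }
    }

    /** Starts from [a, b, c] and applies three insert requests. */
    static List<String> demo() {
        List<String> children = new ArrayList<>(Arrays.asList("a", "b", "c"));
        insertChildAfter(children, "b", "x");       // x lands between b and c
        insertChildBefore(children, "a", "y");      // y becomes the first child
        insertChildAfter(children, "missing", "z"); // reference child absent, list unchanged
        return children;
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```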
-
addItemForMoving
void addItemForMoving(BaseToken item)
-
getItemsToMove
java.util.List<BaseToken> getItemsToMove()
-
setItemsToMove
void setItemsToMove(java.util.List<BaseToken> itemsToMove)
-
isFormed
boolean isFormed()
-
setFormed
void setFormed(boolean isFormed)
-
setFormed
void setFormed()
-
transformAttributes
void transformAttributes(TagTransformation tagTrans)
-
traverse
public void traverse(TagNodeVisitor visitor)
Traverses the tree and performs visitor's action on each node. It stops when it has visited the whole tree or when the visitor returns false.
- Parameters:
visitor
- TagNodeVisitor implementation
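The stop-on-false contract described above can be sketched without the library. `Node` and `Visitor` below are hypothetical stand-ins for TagNode and TagNodeVisitor, and the depth-first, parent-before-children order is an assumption; this page only states that traversal ends when the tree is exhausted or the visitor returns false.

```java
import java.util.ArrayList;
import java.util.List;

class TraverseSketch {
    /** Hypothetical stand-in for a tree node; not the library's TagNode. */
    static final class Node {
        final String name;
        final List<Node> children = new ArrayList<>();

        Node(String name) { this.name = name; }

        Node add(Node child) { children.add(child); return this; }
    }

    /** Hypothetical visitor: return false to stop the whole traversal. */
    interface Visitor {
        boolean visit(Node parent, Node node);
    }

    /** Visits node, then its subtree; returns false as soon as the visitor asks to stop. */
    static boolean traverse(Node parent, Node node, Visitor visitor) {
        if (!visitor.visit(parent, node)) {
            return false;
        }
        // Iterate over a copy so a visitor that edits the child list cannot break the loop.
        for (Node child : new ArrayList<>(node.children)) {
            if (!traverse(node, child, visitor)) {
                return false;
            }
        }
        return true;
    }

    /** Records visited names, stopping once a node named stopAt has been visited. */
    static List<String> visitedUntil(Node root, String stopAt) {
        List<String> seen = new ArrayList<>();
        traverse(null, root, (parent, node) -> {
            seen.add(node.name);
            return !node.name.equals(stopAt);
        });
        return seen;
    }

    /** Builds a tree shaped like: html > [ head, body > [ div, p ] ]. */
    static Node sampleTree() {
        return new Node("html")
                .add(new Node("head"))
                .add(new Node("body").add(new Node("div")).add(new Node("p")));
    }

    public static void main(String[] args) {
        System.out.println(visitedUntil(sampleTree(), "div"));
        System.out.println(visitedUntil(sampleTree(), "no-such-tag"));
    }
}
```

Returning a boolean from the recursive helper is what lets a single false from the visitor unwind every level of the recursion at once.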
-
traverseInternally
private boolean traverseInternally(TagNodeVisitor visitor)
-
collectNamespacePrefixesOnPath
void collectNamespacePrefixesOnPath(java.util.Set<java.lang.String> prefixes)
Collects all prefixes in namespace declarations up the path to the document root from the specified node.
- Parameters:
prefixes
- Set of prefixes to be collected
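A minimal sketch of collecting prefixes along the path to the root, assuming each node keeps its own prefix-to-URI map and a parent reference (as the nsDeclarations and parent fields suggest). `Node` and `collectPrefixesOnPath` are hypothetical names; the package-private method itself is not called here.

```java
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

class NamespacePathSketch {
    /** Hypothetical stand-in for a node with a parent link and namespace declarations. */
    static final class Node {
        final Node parent;
        final Map<String, String> nsDeclarations = new LinkedHashMap<>();

        Node(Node parent) { this.parent = parent; }

        Node declare(String prefix, String uri) {
            nsDeclarations.put(prefix, uri);
            return this;
        }
    }

    /** Adds every prefix declared on node or on any of its ancestors to the given set. */
    static void collectPrefixesOnPath(Node node, Set<String> prefixes) {
        for (Node current = node; current != null; current = current.parent) {
            prefixes.addAll(current.nsDeclarations.keySet());
        }
    }

    /** Collects prefixes starting from a leaf two levels below the root. */
    static Set<String> demo() {
        Node root = new Node(null).declare("xhtml", "http://www.w3.org/1999/xhtml");
        Node middle = new Node(root).declare("svg", "http://www.w3.org/2000/svg");
        Node leaf = new Node(middle);
        // A sibling branch: its declaration is not on the leaf's path to the root.
        new Node(root).declare("m", "http://www.w3.org/1998/Math/MathML");

        Set<String> prefixes = new LinkedHashSet<>();
        collectPrefixesOnPath(leaf, prefixes);
        return prefixes;
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

Only declarations on the leaf's own ancestors are collected; the sibling branch's prefix is left out because it is not on the path to the root.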
-
getNamespaceURIOnPath
java.lang.String getNamespaceURIOnPath(java.lang.String nsPrefix)
-
serialize
public void serialize(Serializer serializer, java.io.Writer writer) throws java.io.IOException
-
makeCopy
TagNode makeCopy()
-
-