Xpath with Puppeteer

Table of contents

What is XPath
Syntax of XPath
Xpath with Tagname
Xpath with Index
Xpath with Attribute
Xpath With Parent Reference
Xpath with Group Index
text() function in Xpath
Wild card Character with XPath in Puppeteer
Dependent and Independent Xpath
contains() function in Xpath
Normalize Space in Xpath
starts-with function xpath
ends-with func xpath
Last() function in Xpath
Position function in Xpath

What is XPath

XPath stands for XML Path Language. It was designed to navigate and select nodes in XML documents. HTML follows a similar tree structure, so we can apply XPath to HTML pages as well, including through Puppeteer.

We assume the reader is familiar with TryXpath and the browser developer tools, which we use to inspect elements and verify our XPaths.

An XPath is a string expression used to find element(s), with Puppeteer as well as with other automation tools.

We should give XPath the lowest priority among locators, because XPath tends to be a little slower than other locators. Only when we cannot find the element with id, name, link text, or CSS should we go for XPath.

Puppeteer supports XPath 1.0 only; XPath 2.0 and 3.0 are not supported. In this tutorial, we are going to learn how to build XPaths and verify them.

Syntax of XPath

XPath follows a very simple syntax, shown in the image below.

XPath Syntax
(image: XPath syntax)
HTML code Syntax
(image: HTML code syntax)

An HTML element can have any number of attributes. Text and a closing tag are not mandatory for some elements.

There are two kinds of xpaths

Absolute xpath
Relative xpath

Absolute Xpath
(image: absolute XPath)

/ - points to the first node of the HTML document, which is the html tag
Note: we are not going to focus on absolute XPath.

Relative Xpath
(image: relative XPath)

// - points to any node in the webpage

tagName - the name that appears right after the < (angle bracket)

attribute - anything inside the < and > brackets other than the tag name is an attribute; an element can have any number of attributes

attribute's value - the value corresponding to the attribute. For a boolean attribute, developers may not specify any value; in those cases, the mere presence of the attribute is treated as true.

Text - the value present between > and <

Now let's form the XPath for the above HTML code.
//a[@class='idle']

Xpath with Tagname

We can write Xpath based on Tagname, which is very simple.

Syntax for Xpath with Tagname : //tagName

<html>
	<body>
		<div id="pancakes">
			<button type="button">Blueberry</button><br><br>
		</div>
	</body>
</html>

In the above code, there is a button under the div. We can write the XPath with the tag name: //button

Xpath with Index

Elements are rarely unique on a web page, except on simple pages such as a login page.

Please save the below HTML file as composite-xpath.html on your local machine

<html>
	<body>
		<div id="pancakes">
			<button type="button">Blueberry</button><br><br>
			<button type="button">Banana</button><br><br>
			<button type="button">Strawberry</button><br><br>
		</div>
	</body>
</html>

Open the above HTML file in Chrome, then press F12, or right-click an element and choose Inspect, or press Ctrl+Shift+I.

The Chrome developer tools should look like the image below once opened.
(image: Chrome developer tools showing composite-xpath.html)

Press Ctrl+F in the Elements panel to verify an XPath, and write the XPath based on the XPath syntax.

Xpath based on the Tagname : //button
(image: //button shows three matches)

When you try the XPath with the tag name, it shows three matches, so we cannot proceed, because we want exactly one match. We must write an XPath expression that has only one match.

When all the matching elements are under one parent (as in this case), we can add an index to the XPath.

Syntax for Xpath with Index : //tagName[index]

The index must be enclosed in square brackets ('[' and ']'). Indexes start from 1 in XPath.
(image: XPath with index)

Xpath for the elements :
Blueberry button - //button[1]
Banana button - //button[2]
Strawberry button - //button[3]
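
To check these index expressions outside the browser, here is a minimal sketch in Python. It is only an illustration: Puppeteer itself is driven from JavaScript, and Python's standard xml.etree.ElementTree module supports just a small subset of XPath. The sketch assumes a simplified copy of composite-xpath.html without the <br> tags (ElementTree needs well-formed XML), and it writes //button as .//button because ElementTree paths are relative.

```python
import xml.etree.ElementTree as ET

# Assumption: simplified, well-formed copy of composite-xpath.html (no <br> tags).
PAGE = """
<html>
  <body>
    <div id="pancakes">
      <button type="button">Blueberry</button>
      <button type="button">Banana</button>
      <button type="button">Strawberry</button>
    </div>
  </body>
</html>
"""

root = ET.fromstring(PAGE)

# //button matches all three buttons.
all_buttons = [b.text for b in root.findall(".//button")]

# //button[1], //button[2], //button[3]: indexes start at 1.
by_index = {i: root.find(f".//button[{i}]").text for i in (1, 2, 3)}

print(all_buttons)
print(by_index)
```

The first print lists the three matches of the tag-name XPath; the second shows which single button each index selects.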

Xpath with Attribute

We can use an index-based XPath with Puppeteer when there are several matches under one parent; the index might not be enough if there is more than one parent.

Store the below HTML on your local system and open it with Chrome

<html>
	<body>
		<div id="pancakes">
			<button type="button">Blueberry</button><br><br>
			<button type="button" name='banana' >Banana</button><br><br>
			<button type="button">Strawberry</button><br><br>
		</div>
		<div id="pancakes">
			<button type="button">Apple</button><br><br>
			<button type="button">Orange</button><br><br>
			<button type="button">Grape</button><br><br>
		</div>
	</body>
</html>

Let's try to write an XPath for the Banana button. The XPath based on an index is //button[2], but it has two matches: 1. Banana, 2. Orange.

With an index alone, we cannot solve this issue.

Let's consider the other properties of the HTML element. Banana has a name attribute, so we can form the XPath based on that attribute.

Xpath with Attribute : //tagName[@attribute='attribute value']

The XPath based on the attribute is //button[@name='banana']; this XPath shows only one match, which is the Banana button.

You can add any number of attributes in one XPath.

Xpath with multiple Attributes : //tagName[@attrib='attrib value'][@attrib2='attrib2 value']...

Can I use an index along with an attribute? Yes, you can, but the index is useful only when the matches are under a single parent.

Xpath with Attribute and Index : //tagName[@attribute='attribute value'][index]
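
The difference between the index and the attribute predicate can be checked with a small Python sketch. It is an illustration under assumptions, not Puppeteer code: it uses the standard library's xml.etree.ElementTree (a limited XPath subset, relative paths only) on a simplified copy of the HTML above with the <br> tags removed.

```python
import xml.etree.ElementTree as ET

# Assumption: simplified, well-formed copy of the two-div page above.
PAGE = """
<html><body>
  <div id="pancakes">
    <button type="button">Blueberry</button>
    <button type="button" name="banana">Banana</button>
    <button type="button">Strawberry</button>
  </div>
  <div id="pancakes">
    <button type="button">Apple</button>
    <button type="button">Orange</button>
    <button type="button">Grape</button>
  </div>
</body></html>
"""

root = ET.fromstring(PAGE)

# //button[2]: the index is evaluated per parent, so each div contributes one match.
second_buttons = [b.text for b in root.findall(".//button[2]")]

# //button[@name='banana']: the attribute narrows the result to one element.
banana = [b.text for b in root.findall(".//button[@name='banana']")]

# //button[@type='button'][@name='banana']: several predicates chained.
chained = [b.text for b in root.findall(".//button[@type='button'][@name='banana']")]

print(second_buttons, banana, chained)
```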

Xpath With Parent Reference

We cannot expect an HTML element to have different or unique properties all the time. Sometimes every element has the same kind of attributes; in those cases, we cannot use XPath with attribute in Puppeteer.

To handle such cases, we need the help of the parent element to find our actual element.

Store the below code in an HTML file and open it in Chrome

<html>
	<body>
		<div id="berry">
			<button type="button">Blueberry</button><br><br>
			<button type="button">Banana</button><br><br>
			<button type="button">Strawberry</button><br><br>
		</div>
		<div id="fruit">
			<button type="button">Apple</button><br><br>
			<button type="button">Orange</button><br><br>
			<button type="button">Grape</button><br><br>
		</div>
	</body>
</html>

Let's write an XPath for Orange, using the parent and child concept.

The syntax for XPath with parent and child
(image: parent-child XPath syntax)

For the Orange element, we have to refer to its parent div, which has the id attribute fruit.
Xpath for the Orange : //div[@id='fruit']/button[2]
(image: parent-child XPath match)
We have only one match for the XPath we have written.

The explanation for Xpath : //div[@id='fruit']/button[2]
// - looks for any node with tag name div and id fruit; / then looks for an immediate child node with tag name button at index 2.
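
A minimal Python sketch of the parent reference follows. It assumes a simplified copy of the berry/fruit page without <br> tags and uses the standard library's xml.etree.ElementTree, which handles this expression when it is written as a relative path; it is an offline check, not Puppeteer code.

```python
import xml.etree.ElementTree as ET

# Assumption: simplified, well-formed copy of the berry/fruit page above.
PAGE = """
<html><body>
  <div id="berry">
    <button type="button">Blueberry</button>
    <button type="button">Banana</button>
    <button type="button">Strawberry</button>
  </div>
  <div id="fruit">
    <button type="button">Apple</button>
    <button type="button">Orange</button>
    <button type="button">Grape</button>
  </div>
</body></html>
"""

root = ET.fromstring(PAGE)

# //div[@id='fruit']/button[2]: pick the parent first, then its second button child.
matches = [b.text for b in root.findall(".//div[@id='fruit']/button[2]")]

print(matches)
```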

Xpath with Group Index

Sometimes an XPath index gives more than one match because the matches are under different parents; in these situations, a plain index does not help. We have to use a group index in these kinds of scenarios.

The group index puts all the matches into one list and then indexes them, so we do not get duplicate matches.

Syntax : (//tagName)[index]

We have to wrap the XPath in parentheses to make it a group, and then apply the index to the group.

Store the below HTML code in an html file :

<html>
	<body>
		<div id="fruit"><br><br><br>
			<button type="button">Blueberry</button><br><br>
			<button type="button"  >Banana</button><br><br>
			<button type="button">Strawberry</button><br><br>
		</div>
		<div id="fruit">
			<button type="button">Apple</button><br><br>
			<button type="button" >Orange</button><br><br>
			<button type="button">Grape</button><br><br>
		</div>
	</body>
</html>

Let's write XPath for Orange : (//button)[5]
(image: group index XPath)
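
The idea behind the group index can be sketched in Python. Note the assumptions: the standard library's xml.etree.ElementTree does not support the parenthesised (//tagName)[index] form, so the sketch emulates it by collecting all matches into a list and indexing that list; the page is a simplified copy of the HTML above without <br> tags.

```python
import xml.etree.ElementTree as ET

# Assumption: simplified, well-formed copy of the two-div page above.
PAGE = """
<html><body>
  <div id="fruit">
    <button type="button">Blueberry</button>
    <button type="button">Banana</button>
    <button type="button">Strawberry</button>
  </div>
  <div id="fruit">
    <button type="button">Apple</button>
    <button type="button">Orange</button>
    <button type="button">Grape</button>
  </div>
</body></html>
"""

root = ET.fromstring(PAGE)

# //button[2] is evaluated per parent and therefore matches twice.
per_parent = [b.text for b in root.findall(".//button[2]")]

# Emulation of (//button)[5]: one flat list of all matches, then a 1-based index.
all_buttons = root.findall(".//button")
fifth = all_buttons[5 - 1].text

print(per_parent, fifth)
```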

text() function in Xpath

There will be situations where you cannot use any HTML property other than the text present in the element.

The text() function helps us find the element based on the text present in the element. The text() comparison is case sensitive.

<button type="button">Blueberry</button><br><br>

In the above code, the text is Blueberry, and we can write an XPath using text() as below

xpath with text : //button[text()='Blueberry']

Note: we use the @ sign for attributes; functions do not need the @ sign

We can also match element(s) that have any text in them with the below XPath

xpath with text : //button[text()]

Note that //button/text() selects the text nodes themselves rather than the button elements.

Wild card Character with XPath in Puppeteer

* is one of the most used wildcard characters with XPath in Puppeteer; we can use it in place of the tag name and the attribute name

//* - matches all the elements present in the HTML (including the html element)

//div/* - matches all the immediate child element(s) of the div tag

//input[@*] - matches all the input element(s) that have at least one attribute; the attribute value may or may not be present

//*[@*] - matches all the element(s) that have at least one attribute.
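
The wildcard expressions can be illustrated with a short Python sketch. It rests on several assumptions: the page is a made-up, well-formed example (the input, button and span elements are hypothetical), and Python's xml.etree.ElementTree supports * for tag names but not @*, so the attribute wildcard is emulated with a plain filter. It is not Puppeteer code.

```python
import xml.etree.ElementTree as ET

# Assumption: hypothetical sample page, written as well-formed XML.
PAGE = """
<html><body>
  <div id="fruit">
    <input type="text" name="qty"/>
    <button type="button">Apple</button>
    <span>plain</span>
  </div>
</body></html>
"""

root = ET.fromstring(PAGE)

# //* : every element, including html. ElementTree's ".//*" leaves out the
# element it starts from, so root.iter() is used here to include it.
every_element = [e.tag for e in root.iter()]

# //div/* : immediate children of the div.
div_children = [e.tag for e in root.findall(".//div/*")]

# //*[@*] : emulated by keeping elements with a non-empty attribute dict.
with_attributes = [e.tag for e in root.iter() if e.attrib]

print(every_element, div_children, with_attributes)
```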

Dependent and Independent Xpath

We may face scenarios where a given element changes its position every time; to handle such scenarios, we use dependent and independent XPaths.

For example: take any e-commerce website, search for a specific product, and write an XPath for that particular product. Come back a few days later and search for the same product: its position may have changed. To handle this, we should use the dependent and independent concept.

Scenario: Select the checkbox which is present in the same row as WebdriverIO

Select   Tool                Language
[ ]      Selenium            Java
[ ]      WebdriverIO         Typescript
[ ]      Selenium Bindings   Python
[ ]      QTP                 VB

Steps to solve the scenario:

Do not write the XPath for the checkbox directly, because the checkbox might change its position.
Write the XPath based on the text present in the WebdriverIO cell instead.
We have to find the common parent of WebdriverIO and the checkbox.
Xpath to find the WebdriverIO cell : //td[text()='WebdriverIO']
Now we should find the parent of the WebdriverIO element.
We can find the parent of an element using /.. , as in Unix paths.
Xpath for parent of WebdriverIO : //td[text()='WebdriverIO']/..
Check whether WebdriverIO's parent is a common parent of WebdriverIO and the checkbox.
Yes, WebdriverIO's parent is a common parent of WebdriverIO and the checkbox.
Now navigate to the checkbox using the checkbox's properties.
The checkbox has the tag name input : //td[text()='WebdriverIO']/..//input
Try the above XPath; it will highlight the checkbox related to the WebdriverIO row.

Here WebdriverIO is independent and the checkbox is dependent
Independent: It does not depend on any other element
Dependent: We have to find this based on the other element(Independent)
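
The steps above can be sketched in Python as an offline check. The assumptions are: the table markup and the checkbox ids (cb-selenium and so on) are hypothetical, since the original page's HTML is not shown; the standard library's xml.etree.ElementTree is used instead of Puppeteer; and because ElementTree has no text() function, the predicate [.='WebdriverIO'] stands in for [text()='WebdriverIO'].

```python
import xml.etree.ElementTree as ET

# Assumption: hypothetical markup for the tool table; ids are invented for the demo.
TABLE = """
<table>
  <tr><th>Select</th><th>Tool</th><th>Language</th></tr>
  <tr><td><input type="checkbox" id="cb-selenium"/></td><td>Selenium</td><td>Java</td></tr>
  <tr><td><input type="checkbox" id="cb-webdriverio"/></td><td>WebdriverIO</td><td>Typescript</td></tr>
  <tr><td><input type="checkbox" id="cb-qtp"/></td><td>QTP</td><td>VB</td></tr>
</table>
"""

root = ET.fromstring(TABLE)

# Independent element: the cell whose text is WebdriverIO.
cell = root.find(".//td[.='WebdriverIO']")

# Its parent (/..) is the row shared with the checkbox.
row = root.find(".//td[.='WebdriverIO']/..")

# Dependent element: the input found by going down again from that row.
checkbox = root.find(".//td[.='WebdriverIO']/..//input")

print(cell.text, row.tag, checkbox.get("id"))
```

Because the checkbox is located through the row that contains the WebdriverIO text, the expression keeps working when the rows are reordered.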

contains() function in Xpath

The contains() function helps the user find an element by a partial value or a dynamically changing value; contains() checks whether the value contains the given portion

contains function : //xpath[contains(@attribute, 'attribute value')]
                    //xpath[contains(text(), 'text value')]

Example, using the below html:

<html>
<body>
	<div id="fruit"><br><br><br><br><br><br><br><br><br><br><br>
		<button type="button">Blue berry1234</button><br><br>
		<button type="button" >Banana</button><br><br>
		<button type="button">Straw</button><br><br>
		<button type="button">berry</button><br><br>
		<button type="button">Straw berry</button><br><br>
		</div>
	</body>
</html>

Xpath for the Blue berry1234 button : //button[contains(text(),'Blue')]
Xpath for the Banana button : //button[contains(text(),'Ban')]

More complex items:
In the same way, if you try to find the Straw berry button with //button[contains(text(),'Straw')], it also matches the element with the text Straw.

If you try with berry, it also matches the 'berry' and 'Blue berry1234' elements. So how do we find only the Straw berry button?

We can combine more than one contains function, like : //xpath[contains(text(), 'text1')][contains(text(), 'text2')]
Xpath for Straw berry is : //button[contains(text(),'Straw')][contains(text(), 'berry')]

The contains function is not limited to text; you can apply it to other properties as well. Eg : //button[contains(@type,'but')]
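
The effect of chaining contains() predicates can be reasoned through with a small Python sketch. This is an emulation under assumptions, not an XPath evaluation: Python's standard xml.etree.ElementTree has no contains() function, so the sketch parses a simplified copy of the HTML above and applies the same substring test with Python's in operator. The helper name containing is made up for the example.

```python
import xml.etree.ElementTree as ET

# Assumption: simplified, well-formed copy of the button list above.
PAGE = """
<div id="fruit">
  <button type="button">Blue berry1234</button>
  <button type="button">Banana</button>
  <button type="button">Straw</button>
  <button type="button">berry</button>
  <button type="button">Straw berry</button>
</div>
"""

root = ET.fromstring(PAGE)
texts = [b.text for b in root.iter("button")]


def containing(*parts):
    """Return the button texts that contain every part, like chained contains()."""
    return [t for t in texts if all(p in t for p in parts)]


only_straw = containing("Straw")          # one contains(): two matches
only_berry = containing("berry")          # one contains(): three matches
both = containing("Straw", "berry")       # two contains(): a single match

print(only_straw, only_berry, both)
```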

Normalize Space in Xpath

normalize-space matches the element while ignoring leading and trailing spaces; it also collapses runs of whitespace inside the value into a single space

syntax : //xpath[normalize-space(property)='value']

<button type="button">   Strawberry   </button><br><br>

Xpath for the Strawberry element : //button[normalize-space(text())= 'Strawberry']
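
What normalize-space does to a string can be shown with a short Python sketch. It is an approximation written for illustration (the function name normalize_space is made up): it follows the XPath 1.0 description of trimming both ends and collapsing runs of space, tab, carriage return and line feed into one space.

```python
import re


def normalize_space(value: str) -> str:
    """Approximate XPath 1.0 normalize-space(): trim both ends and collapse
    runs of space, tab, CR and LF into a single space."""
    return " ".join(part for part in re.split(r"[ \t\r\n]+", value) if part)


# Text of the button above, with its surrounding spaces.
button_text = "   Strawberry   "

normalized = normalize_space(button_text)
collapsed = normalize_space("  Straw \n  berry ")

print(repr(normalized), repr(collapsed))
```

After normalisation the button text equals 'Strawberry', which is why the comparison in the XPath above succeeds even though the raw text has extra spaces.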

starts-with function xpath

The starts-with function matches elements whose property value starts with the given value

syntax : //xpath[starts-with(@attribute,'starting value')]

<button type="button">Straw berry</button><br><br>

Xpath for the Strawberry element : //button[starts-with(text(), 'Straw')]

ends-with func xpath

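The ends-with function matches elements whose property value ends with the given value. Note that ends-with is an XPath 2.0 function, so it is not available in the XPath 1.0 engine that the browser provides to Puppeteer.

syntax (XPath 2.0) : //xpath[ends-with(@attribute,'ending value')]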

<button type="button">Straw berry</button><br><br>

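Xpath for the Strawberry element (XPath 2.0 only) : //button[ends-with(text(), 'berry')]
XPath 1.0 equivalent : //button[substring(text(), string-length(text()) - string-length('berry') + 1) = 'berry']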
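
Since ends-with belongs to XPath 2.0, a common XPath 1.0 workaround compares the tail of the value using substring() and string-length(). The Python sketch below only mirrors that arithmetic so the positions can be checked by hand; the function names are made up, and it is not an XPath engine.

```python
def xpath_substring_from(value: str, start: int) -> str:
    """Mimic two-argument XPath substring(), whose positions start at 1."""
    return value[max(start, 1) - 1:]


def ends_with_xpath1(value: str, suffix: str) -> bool:
    """Mirror substring(v, string-length(v) - string-length(s) + 1) = s."""
    start = len(value) - len(suffix) + 1
    return xpath_substring_from(value, start) == suffix


# 'Straw berry' has 11 characters and 'berry' has 5, so the tail starts at 11 - 5 + 1 = 7.
tail_start = len("Straw berry") - len("berry") + 1
matches_straw_berry = ends_with_xpath1("Straw berry", "berry")
matches_straw = ends_with_xpath1("Straw", "berry")

print(tail_start, matches_straw_berry, matches_straw)
```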

Last() function in Xpath

By default, automation tools take the first instance of the match; if we explicitly want the first element, we can use index [1]. But on some pages, for example while the page is loading or on a dynamic page, we may not know how many matches are present.

The last() function in XPath helps the user find the last match of the element.

last function : //xpath[last()]

Take the below html code as an example

<html>
	<body>
		<div id="fruit"><br><br><br>
			<button type="button">Blueberry</button><br><br>
			<button type="button"  >Banana</button><br><br>
			<button type="button">Strawberry</button><br><br>
		</div>
	</body>
</html>

In the above code, writing an XPath for the last element is easy: we can use index [3]. But if the application is very large or dynamic, we cannot say how many elements are going to be present.

So let's use the last function in XPath : //button[last()] - points to the Strawberry button.
(image: XPath with last())
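
A minimal Python sketch of last() follows, assuming a simplified copy of the HTML above without <br> tags. It uses the standard library's xml.etree.ElementTree, which supports the [last()] predicate in relative paths; it is an offline check rather than Puppeteer code.

```python
import xml.etree.ElementTree as ET

# Assumption: simplified, well-formed copy of the single-div page above.
PAGE = """
<html><body>
  <div id="fruit">
    <button type="button">Blueberry</button>
    <button type="button">Banana</button>
    <button type="button">Strawberry</button>
  </div>
</body></html>
"""

root = ET.fromstring(PAGE)

# //button[last()]: the last button under its parent, however many there are.
last_button = root.find(".//button[last()]").text

# //button[3]: gives the same element only while there are exactly three buttons.
third_button = root.find(".//button[3]").text

print(last_button, third_button)
```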

Position function in Xpath

The position function helps the user get the match at a particular index; using position() we can also get the elements whose position is less than or greater than a given value.

position function : //xpath[position()=2]
                    //xpath[position()<2]
                    //xpath[position()>2]
                    //xpath[position()<=2]  ...

Example : //button[position()=2]
(image: XPath with position())
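
The comparisons on position() can be illustrated in Python. This is an emulation under assumptions: Python's standard xml.etree.ElementTree does not implement position(), so the sketch numbers the buttons of one parent from 1 and filters on that number. The page is a simplified copy of the three-button example, and the helper name by_position is made up.

```python
import xml.etree.ElementTree as ET

# Assumption: simplified, well-formed copy of the three-button page.
PAGE = """
<html><body>
  <div id="fruit">
    <button type="button">Blueberry</button>
    <button type="button">Banana</button>
    <button type="button">Strawberry</button>
  </div>
</body></html>
"""

root = ET.fromstring(PAGE)
buttons = root.findall(".//div[@id='fruit']/button")


def by_position(test):
    """Keep buttons whose 1-based position among their siblings passes the test."""
    return [b.text for pos, b in enumerate(buttons, start=1) if test(pos)]


equal_2 = by_position(lambda pos: pos == 2)        # //button[position()=2]
less_than_2 = by_position(lambda pos: pos < 2)     # //button[position()<2]
greater_than_2 = by_position(lambda pos: pos > 2)  # //button[position()>2]
at_most_2 = by_position(lambda pos: pos <= 2)      # //button[position()<=2]

print(equal_2, less_than_2, greater_than_2, at_most_2)
```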

CaSe in-sensitive Xpath in Puppeteer

Sometimes we have to find an element based on an attribute. We can use @ for the attribute, but if the attribute value changes between lower case, upper case, or mixed case every time the page refreshes, a plain @ match may not help us.

In such situations, we must ignore the case (UPPER/lower). Below is the syntax to match elements while ignoring case; the translate function helps us do this.

Syntax :<s