Skip to main content

Read & Write Data from Excel in Selenium: Apache POI

 Selenium WebDriver is the most used automation tool for the automation of web applications. Now, we know that these web applications are used by multiple users, and each one of those uses the applications as per their own data. So, considering the usage, it becomes the primary responsibility of the QAs also to test the web applications with varying data sets. Now the user journeys will be the same, but the data set will be different. Therefore, it makes more sense to execute the same test case with different data, instead of writing a separate test case for each user journey with each data set. This is where Microsoft Excel comes in handy, which is one of the favorite tools for storing test data. Excel in Selenium is one of the most used combinations for storing test data and then running the same test case against various data sets.

There are various libraries in JAVA which helps in reading/writing data from Excel files. But, Apache POI is one of the most used libraries, which provides various classes and methods to read/write data from various formats of Excel files(xls, xlsx etc). Subsequently, in this article, we will understand the details of Apache POI and how we can use the same to read/write data from Excel files, by covering the details under the following topics:

  • What is Apache POI?
    • How to manage Excel workbooks?
    • How to manage Excel sheets?
    • Also, how to manage Excel rows?
    • How to manage Excel cells? 
  • How to read data from Excel in Selenium tests using Apache POI?
    • Additionally, how to read a specific cell value?
    • How to read the entire Excel sheet?
  • How to write data in Excel in Selenium tests using Apache POI?
    • Moreover, how to write to a new cell in an existing row? 
    • And, how to write to a new cell in a new row?

 

 

What is Apache POI?

Apache POI,  where POI stands for (Poor Obfuscation Implementation)  is an API that offers a collection of Java libraries that helps us to read, write, and manipulate different Microsoft files such as excel sheets, power-point, and word files.

apache poi in selenium

 

Apache POI uses certain terms to work with Microsoft Excel. Let’s get familiar with these before we go into the details of the code.

TermDetails
Workbookworkbook represents a Microsoft Excel file. It can be used for creating and maintaining the spreadsheet. A workbook may contain many sheets.
Sheetsheet refers to a page in a Microsoft Excel file that contains the number of rows and columns.
RowA row represents a collection of cells, which is used to represent a row in the spreadsheet.
Cellcell is indicated by a row and column combination. Data entered by a user is stored in a cell. Data can be of the type such as string, numeric value, or formula.

Before we start, the first step is to download the jar files required to use the library. You can download the Apache POI library by referring to Steps to Download Apache POI.

The below image clearly depicts the structure and how the classes and interfaces are aligned in Apache POI.

apache poi classes and interfaces

 

Let’s now understand how we can access and manage various components in an Excel file using Apache POI ?

 

 

How to manage Excel workbooks pragmatically?

Apache POI provides various interfaces and classes that help us to work with Excel. It provides a “Workbook ” interface to maintain Excel Workbooks. There are certain classes that implement this interface and we use these classes to create, modify, read, and write data in Excel files. The two mainly used  classes for managing Excel Workbooks are:

  • HSSFWorkbook- These class methods are used to read/write data to Microsoft Excel file in .xls format. It is compatible with MS-Office versions 97–2003.
  • XSSFWorkbook- These class methods are used to read-write data to Microsoft Excel in .xls or .xlsx format. It is compatible with MS-Office versions 2007 or later.

 

How to manage Excel sheets programmatically?

There is another interface, “Sheet “, which we use to create a sheet in the Workbook. There are two classes that used to work with sheets, same as we have for Workbook Interface:

  • HSSFSheet – This class is used to create a new sheet in the HSSFWorkbook, ie, the older format of Excel.
  • XSSFSheet – This class is used to create a new sheet in the XSSFWorkbook., ie, the new format of Excel

 

How to manage Excel rows pragmatically?

The Row interface provides us with the ability to work with rows in the Excel sheet. Below two classes implement this interface:

  • HSSFRow – This represents a row in the HSSFSheet.
  • XSSFRow – This represents a row in the XSSFSheet.

 

How to manage Excel cells? 

The Cell interface helps us in accessing the cells of a particular row. There are two classes that implement this interface and we can use for reading/writing data into the cell:

  • HSSFCell – We use it to work with cells of HSSFRow.
  • XSSFCell – We use it to work with cells of XSSFRow.

Now that we have gone through the details of the Apache POI library, let’s try to use it to read and write into Excel Files using Selenium WebDriver.

 

 

How to read data from Excel in Selenium tests using Apache POI?

Suppose, for a Selenium test case, we need to read the student data from the Excel Sheet, having a sample data as shown below:

sample excel file

 

While reading the Excel file, Apache POI can read data in two ways:

  • You want to read the value of a particular cell, for instance, you want to get the address of the student present in the second row.
  • You can read the entire excel in one go. It is based upon the need for your test script and the data needed for test execution.

 

We will understand both the ways of reading the excel in Selenium, but before that, there are some common steps to follow:

1. The first step is to obtain the Excel Workbook based upon its location on the computer. You can create an object of the workbook by referring to the FileInputStream object that points to the excel file. We can do it as shown below using the HSSFWorkbook Class. If you are using MS-Office versions 97–2003 or XSSFWorkbook Class if MS-Office versions 2007 or later. In the below line, we use HSSFWorkbook as an Excel version is 97-2003.

 

2. Once we create the Workbook, the next step is to create a Sheet in the Workbook. Additionally, we can do it as below using the name of the sheet in the getSheet (String sheetName) method. Here, “STUDENT_DATA” is the name of the sheet in the Excel Workbook.

You can also create a sheet based upon the index using the getSheetAt (int index) method as shown below –

 

3. After the sheet creation, we have to obtain the row of the sheet, which we can retrieve using the getRow (int rowIndex) method of the sheet object:

 

4. Once you have got the row, you can get the cell of the row using the getCell (int index) method of the HSSFRow class:

 

5. After you obtain the cell that contains the data, you can read the data in different formats like String, Date, Number using the different methods which are based upon the format of the cell you specify in the excel sheet.

  • String – getStringCellValue()  [It can be used to read Name of the student from Excel ]
  • Number – getNumericCellValue() [It can be used to read the mobile number of the student]
  • Date – getDateCellValue() [It can be used to read the Date of Birth of the student]

Various data formats in Excel

 

The image above shows the formats feasible for a cell value in Excel(General, Text, Number, Date, Time). For an easier approach, you can specify all values as Text (even numbers) and read them in the String variable.

 

 

How to read a specific cell value?

Now that we are familiar with the different classes and the method provided by the Apache POI library, let’s try to combine them in a code snippet, where we try to read the Address of the student in the first row in our sample Excel. The Address is present in cell number 5 of the row.

NoteIndex starts from zero for both the row and cell.

Data Driven Framework-SampleAddressinExcelFile

 

You can use the below code snippet to print the address as highlighted in the above image using the methods explained above –

When we run the above program, we will get the output as follows:

Read one cell of Excel using Apache POI

 

The highlighted area shows the address of the first student that is printed using the code.

Now that we have understood how to read a particular cell value, we will now take a look at how to read the complete data from the Excel File.

 

 

How to read the entire Excel sheet?

To read the complete data from Excel, you can iterate over each cell of the row, present in the sheet. For iterating, you need the total number of rows and cells present in the sheet. Additionally, we can obtain the number of rows from the sheet, which is basically the total number of rows that have data present in the sheet by using the calculation –

RowCount = LastRowNumber -First Row Number 

To get the last and first-row number, there are two methods in the sheet  class:

  • getLastRowNum() 
  • getFirstRowNum()

So, we can obtain the row count using the below code:

Once you get the row, you can iterate over the cells present in the row by using the total number of cells, that we can calculate using getLastCellNum() method:

 

Let’s try to print the entire data present in the sheet using the below code:

The output of the code snippet is:

Read all rows using Apache POI

 

In the above image, you can see that the data present in the excel prints and also notice that Row 0 prints the Title. Moreover, you can avoid printing the title by starting the loop from value =1 instead of 0.

 

 

How to write data in Excel in Selenium tests using Apache POI?

Similar to reading, writing data in Excel files can be as important, as it can serve to save the test results back in the Excel sheets. Apache POI provides various set methods, which we can use to write the data in an Excel in Selenium tests itself. Consequently, let’s  see how we can achieve the same:

 

How to write to a new cell in an existing row?

Suppose, we want to write the result of the test run using the given test data in the same row, in which we have the input data. Consider, we just have to put a “PASS/FAIL” in the last column of the row, we can achieve the same using Apache POI as shown below:

Suppose consider a scenario that on page “https://demoqa.com/automation-practice-form“, we have the fill the student registration form by reading the data from an Excel file and then append the result in the last cell of the row, if it was successful. Subsequently, we can achieve the same with the help of using data of Excel in Selenium using Apache POI library, as shown below:

We can see the output of the above code by opening the excel file that we use in the code (in our case “E:\\TestData\\TestData.xls” ) :

Writing data to same row using Apache POI

 

As seen in the image above, PASS is written into Excel File after the test execution against the student data which was registering at the given time.

Now suppose, if we need to write the data in a new row altogether, we can also achieve the same by using Apache POI. Subsequently, let’s see how to write data to a new cell in a new row?

 

 

How to write to a new cell in a new row?

The Apache POI sheet class provides methods to create new rows and then cells in these rows. Moreover, you can create a new row in an Excel sheet as follows:

After the row creates, we can create the cells and input the data in the excel sheet, as shown below-

Subsequently, let’s write additional student data in the sheet by creating a new row in our sample Excel:

Consequently, the output of the above code will appear in the Excel sheet which looks like  –

Write in a new row using Apache POI

 

As seen in the above image, we add an additional row to the Excel Sheet having the same details as in the code.

Comments

Popular posts from this blog

Software Testing Interview Question for beginners and experienced

 Software Testing Interview Question for beginners and experienced Software Testing Interview Question for beginners and experienced- software testing interview questions INTERVIEW QUESTIONS Project specific Questions 1. What was the duration of your project? 2. Explain about your project. 3. How many testers were their on this project. 4. According to you which was the complex part of the project from testing point of view. 5. How did you do the testing of your project? 6. How many test cases have you designed? How many you wrote in a day? 7. How many bugs did you find? 8. Tell me any high Severity bugs that you found. 9. What happens when the client changes requirements? 10. Which tool you used for defect reporting? 11. What is the database used for your project. 12. In which technology this application is developed? Company specific Questions 1. Where are you working currently? 2. How that Seed has sent you here for an interview when you are working in the same company. 3. Where...

How to analyze a JMeter summary report?

  A  Jmeter  Test Plan must have listener to showcase the result of performance test execution. Listeners capture the response coming back from Server while Jmeter runs and showcase in the form of – tree, tables, graphs and log files. It also allows you to save the result in a file for future reference. There are many types of listeners Jmeter provides. Some of them are: Summary Report, Aggregate Report, Aggregate Graph, View Results Tree, View Results in Table etc. Here is the detailed understanding of each parameter in Summary report. By referring to the figure: Label : It is the name/URL for the specific HTTP(s) Request. If you have selected “Include group name in label?” option then the name of the Thread Group is applied as the prefix to each label. Samples : This indicates the number of virtual users per request. Average : It is the average time taken by all the samples to execute specific label. In our case, the average time for Label 1 is 942 milliseconds & to...

What is SQL

SQL Introduction:What is SQL SQL stands for Structure Query Language. SQL is used to create database and manipulate database. Almost every industries have their own database which is used on daily basis to maintain record of the inventory, Employee record, salary record etc. So Database is very important for all industries whether it is small industries or big industries. Without database , it is very difficult to maintain record so database is required at all place. SQL is a structure query language that enables you to work with a database. Using SQL, you can insert records, update records, and delete records. You can also create new database objects such as databases and tables. And you can drop (delete) them SQL is of two type which is frequently used in an application to save record 1. DDL(Data Definition Language) 2. DML(Data Manipulation language) Data:   Data is a fact which is related to any object For example your name, age, height, weight, etc are some data...