
How to use Selenium to automate browser interactions

Jamie Farrelly

Selenium is a powerful open-source tool for automating web browser interactions, which lets teams rely on automated tests rather than manual testing. Although it can be time-consuming to set up in the short term, the long-term benefits are worth it if done right. This blog post is aimed at developers who have never done any web browser automation before. It’s useful whether you want to run automation tests as part of the software development life cycle (SDLC) at your company or you just want to automate something that you usually do manually.

I’m going to teach you how to log in to a Slack workspace, click on a channel that’s within that workspace and then send a message to the channel - all fully automated. Once you understand how this works, you can then use this as a template for the next flow that you want to automate!

Prerequisites

  • You have a basic understanding of Java
  • You have Maven installed on your machine
  • You have the latest version of Chrome downloaded (I will be running this through Chrome)
  • You’ll also need a version of Chrome Driver that matches your version of Chrome, which is available from the official Chrome Driver download page - download this on to your machine. Chrome Driver is essentially an executable that is used to control Chrome
  • To see the flow working, you’ll need to be a member of a Slack workspace - you’ll see where you need to enter your email address and password in the code later on

I’d recommend that you look at the code that I have on this GitHub repository, which has everything that you need, including every file mentioned below and the structure of the project itself. There are only six classes in total, which I’ll walk you through by showing you the code and then explaining what it does (note that I’ve removed import statements and so on from the code snippets to make them more readable). So, let’s get started!

The first class we’ll look at is one of the most self-explanatory of the six. In this class you’ll just need to change the URL to the one you want to open (in this case it’ll be a Slack workspace) and change the path to where your Chrome Driver has been downloaded. When you download Chrome Driver you’ll get a zip file, so make sure that you’ve extracted it so that you can update this class with the full path to the executable file.
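As a rough sketch of what such a settings class might look like: the class name, constant names, URL and path below are all placeholders made up for illustration, not the ones used in the repository. The only fixed name is webdriver.chrome.driver, the system property that Selenium’s Chrome support reads to locate the Chrome Driver executable.

```java
import java.nio.file.Files;
import java.nio.file.Paths;

// Hypothetical settings holder; the class and constant names are illustrative only.
final class AutomationSettings {
    // Placeholder: replace with the login URL of your own Slack workspace.
    static final String BASE_URL = "https://your-workspace.slack.com";
    // Placeholder: replace with the full path to the extracted Chrome Driver executable.
    static final String CHROME_DRIVER_LOCATION = "/path/to/chromedriver";

    private AutomationSettings() {}

    // Stores the driver path in the "webdriver.chrome.driver" system property
    // and reports whether a file actually exists at that path.
    static boolean registerChromeDriver(String driverPath) {
        System.setProperty("webdriver.chrome.driver", driverPath);
        return Files.isRegularFile(Paths.get(driverPath));
    }
}

class AutomationSettingsDemo {
    public static void main(String[] args) {
        boolean found = AutomationSettings.registerChromeDriver(
                AutomationSettings.CHROME_DRIVER_LOCATION);
        System.out.println("Will open: " + AutomationSettings.BASE_URL);
        System.out.println("Driver executable found: " + found);
    }
}
```

Checking that the file exists is an extra that I’ve added here: it catches the common mistake of pointing at the zip file or at the wrong folder before the browser is ever launched.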

The Page Object Pattern

We’re going to follow the Page Object Pattern. This is an easy pattern to follow since it doesn’t have too many rules. It essentially states that you should create objects which represent the UI that you want to automate/test. For example, if you’re dealing with two pages on a website (a login page and a Slack channel page in our case), you should then have two page objects corresponding to these pages, such as LoginPage.java and ChannelPage.java. By following this approach, if any changes are made to a web page, you know exactly where you need to go to update your code.

This PageObject will be the superclass of our main page objects. This is boilerplate code which sets up the driver and tells it to wait for a maximum of 10 seconds for an element. If it can’t find an element within 10 seconds, you’ll get an error so you know something has gone wrong. Note that this is an exception to the Page Object Pattern that I mentioned above - this class doesn’t map on to a web page but all of our other page objects will extend this class so it’s important to have.

LoginPage is where it starts to get interesting. If you open up the login page for any Slack workspace and log in manually, you’ll notice that there are three steps to it:

  1. Enter email address
  2. Enter password
  3. Click on the login button

Since there are three elements that we need to interact with, we’ll create three variables in the LoginPage: emailInput, passwordInput and loginButton. All of these elements have been found by their ID, which is great because websites don’t always make it this easy for you. When an element has no ID you have to fall back on CSS selectors, which aren’t as straightforward and are more prone to change.

Did you notice how the constructor contains a couple of lines that wait for elements? Why do you think this is needed?

These waits are needed because we need to be 100% sure that the elements that we’re interacting with are fully loaded before we do anything with them. In this case, we’re calling the elementToBeClickable() method since we want to click on all three of the elements, but there are plenty of other useful methods that can be used for other scenarios, such as visibilityOf(), textToBe() and invisibilityOfElementLocated().
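To make the idea of a wait more concrete, here is a minimal plain-Java sketch of what waiting on a condition boils down to: keep checking until the condition holds, and give up with an error once the timeout has passed. This is not Selenium’s implementation; PollingWait and its until() method are hypothetical names, and the "element" is simulated with a timer so that the example runs without a browser.

```java
import java.time.Duration;
import java.util.function.BooleanSupplier;

// Hypothetical stand-in for an explicit wait: poll a condition until it
// holds or the timeout expires.
final class PollingWait {
    private PollingWait() {}

    static void until(BooleanSupplier condition, Duration timeout, Duration pollInterval) {
        long deadline = System.nanoTime() + timeout.toNanos();
        while (true) {
            if (condition.getAsBoolean()) {
                return; // condition met, safe to interact with the element
            }
            if (System.nanoTime() - deadline >= 0) {
                throw new IllegalStateException(
                        "Condition not met within " + timeout.toMillis() + " ms");
            }
            try {
                Thread.sleep(pollInterval.toMillis());
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("Interrupted while waiting", e);
            }
        }
    }
}

class PollingWaitDemo {
    public static void main(String[] args) {
        long start = System.currentTimeMillis();
        // Simulates an element that becomes clickable about 300 ms after page load.
        PollingWait.until(() -> System.currentTimeMillis() - start >= 300,
                Duration.ofSeconds(10), Duration.ofMillis(50));
        System.out.println("Element ready, safe to click");
    }
}
```

The 10-second timeout mirrors the maximum wait described for the PageObject superclass; the 50 ms polling interval is an arbitrary choice for the demo.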

Now let’s take a quick look at the clickLoginButton() method. See how it returns a ChannelPage? This is how the code expresses that when the login button is clicked, we should be redirected to an entirely new page which has a new set of elements. If this doesn’t happen, we’ll get an error in the console. For example, if the password you entered is incorrect then you’ll remain on the login page, and the elements of the channel page will never be found.
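The idea of a method that returns the next page can be sketched without a browser at all. In the simplified stand-in below, FakeBrowser is a hypothetical class that only remembers which page is showing, and LoginPage and ChannelPage are stripped-down illustrations rather than the classes from the repository. The point to notice is that the ChannelPage constructor fails straight away if the login didn’t actually take us there.

```java
// Hypothetical stand-in for the browser: it only tracks which page is showing.
final class FakeBrowser {
    private final String validPassword;
    private String currentPage = "login";

    FakeBrowser(String validPassword) {
        this.validPassword = validPassword;
    }

    String currentPage() {
        return currentPage;
    }

    // A correct password moves to the channel page; anything else stays put.
    void submitLogin(String password) {
        if (validPassword.equals(password)) {
            currentPage = "channel";
        }
    }
}

final class LoginPage {
    private final FakeBrowser browser;
    private String password = "";

    LoginPage(FakeBrowser browser) {
        this.browser = browser;
    }

    LoginPage enterPassword(String password) {
        this.password = password;
        return this; // typing does not leave the login page
    }

    ChannelPage clickLoginButton() {
        browser.submitLogin(password);
        return new ChannelPage(browser); // the constructor checks we really arrived
    }
}

final class ChannelPage {
    ChannelPage(FakeBrowser browser) {
        if (!"channel".equals(browser.currentPage())) {
            throw new IllegalStateException(
                    "Expected the channel page but found: " + browser.currentPage());
        }
    }
}

class PageTransitionDemo {
    public static void main(String[] args) {
        FakeBrowser browser = new FakeBrowser("correct-password");
        new LoginPage(browser).enterPassword("correct-password").clickLoginButton();
        System.out.println("Now on: " + browser.currentPage());
    }
}
```

In a real page object the check in the constructor would typically be a wait for one of the new page’s elements, which is why a failed login shows up as a timeout error.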

Now that we’ve logged in successfully, we’re going to end up on another page. When you log in to Slack you land on the first channel that’s in the workspace, which may not be the channel that you want to send the message in. Because of this, I’ve created an element called channelLink, which is the channel that we want to click on to open. The other element that I’ve created on this page is messageInput, which is where we want to type the message.

Unfortunately, at the time of writing, there were no IDs that could be used for these two elements, so I had to use CSS selectors, which aren’t as pretty - but they still do the job just as well.

At this stage, all of the hard work is done. Now it’s time to actually run the code. You have two options here:

  1. Run the code from a main method
  2. Run the code from a test

I’m going to focus on running it from a test since most people that use Selenium use it as a way to write automation tests to test a feature that they have developed.

We’ll use BaseTest as a superclass that our tests will extend. This class handles how we initialise and terminate the Chrome Driver. Before a test is run, it sets up the driver. When the test has finished, it deletes all cookies and then closes all the windows that are open.
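The set-up and tear-down order can be sketched in plain Java with a try/finally block. This is only an illustration of the lifecycle, not the BaseTest class itself: RecordingDriver and TestLifecycle are hypothetical names, and the stand-in driver just records the calls made on it. Its deleteAllCookies() and quit() methods are named after the Selenium calls driver.manage().deleteAllCookies() and driver.quit().

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical stand-in for the driver that records the calls made on it.
final class RecordingDriver {
    final List<String> calls = new ArrayList<>();

    void deleteAllCookies() {
        calls.add("deleteAllCookies");
    }

    void quit() {
        calls.add("quit");
    }
}

final class TestLifecycle {
    private TestLifecycle() {}

    // Runs one test body and then always cleans up, the way a test
    // framework's "after" hook would. Clean-up runs even if the body throws.
    static void run(RecordingDriver driver, Consumer<RecordingDriver> testBody) {
        try {
            testBody.accept(driver);
        } finally {
            driver.deleteAllCookies(); // leave no session behind for the next test
            driver.quit();             // close every window that is still open
        }
    }
}

class TestLifecycleDemo {
    public static void main(String[] args) {
        RecordingDriver driver = new RecordingDriver();
        TestLifecycle.run(driver, d -> d.calls.add("sendMessage test body"));
        System.out.println(driver.calls);
    }
}
```

Keeping this in a shared superclass means that every test starts with a fresh browser and that a failing test can’t leave a stray Chrome window open.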

Last but not least, we have our SlackTest class which extends from the BaseTest that we just went through above. The sendMessage() test opens up the Slack login page, logs in to the Slack workspace, clicks on a channel and then finally sends the message to the channel.

Your challenge

That’s it! Now that you understand a bit more about Selenium and automation testing, I have a challenge for you. How would you finish off the TODO that is in this test to make sure that the message has successfully been sent?

Here’s a hint: The ChannelPage needs another element to get the last message that’s in the channel, and remember that a WebElement has a very useful method called getText(). Enjoy!