Why Gherkin (Cucumber, SpecFlow,…) Always Failed with UI Test Automation?

Avoid this recipe for test automation failure. Use RSpec instead.

By Zhimin Zhan | Published 9 months ago | 6 min read
Many software projects have tried, or are trying, to use Cucumber for test automation, commonly with Selenium WebDriver for testing web apps. Some might wonder whether my title is just a personal, radical view written to attract attention. No, I merely reworded the view of Aslak Hellesøy, the creator of Cucumber:

“So really, what is Cucumber? As a test tool it sucks. There far better automated test tools” (source)

Some Gherkin fans may say: “That might be a one-off comment”. Oh well, here is another one, from Aslak’s home page.

“If all you need is a testing tool for driving a mouse and a keyboard, don’t use Cucumber. There are other tools that are designed to do this with far less abstraction and typing overhead than Cucumber.” (source)

In my 10+ years of test automation and continuous testing consulting, every test automation attempt with Gherkin-style syntax (Cucumber, SpecFlow, JBehave, Concordion, Gauge, Spinach…) failed, with no exceptions! The biggest failure I heard of: “The project spent three times the development effort (measured in time and money) trying to maintain those Cucumber tests, which were eventually dumped in the bin!”

Why does Cucumber fail in test automation? Let’s hear from DHH (Creator of Ruby on Rails, Founder & CTO at Basecamp & HEY, NYT best-selling author):

DHH’s tweet in March 2011

DHH is correct, as usual. I have never met a single business customer who has actually read (or even run) Cucumber tests.

Once that is established, it is clear that Cucumber tests have little value for customers (and, therefore, for business analysts). From a technical perspective, supporting the extra layer, a test-specific parser for English, is going to cost the team a lot. (Please note, again, that there is little or no business value in doing so.)

Some might still argue: “I disagree. The creator of Cucumber and DHH were wrong; I implemented Cucumber successfully on my last project”. For every person who said this to me, whenever I had the chance to assess their work, it turned out to be plain lies. Think about it: “the last project”? How about this project? Show us. For web test automation, the knowledge is pretty much fully transferable. On every new project, I created at least one working core test (given the environment was ready) and, in most cases, ran it in a Continuous Testing server (e.g. BuildWise) on the first day (yes, you read that right).

What is the main technical reason for Cucumber (i.e. Gherkin) failing at test automation? Test script maintenance. The first few test cases are easy to write, relatively simple and static (e.g. login, sign up). Some get excited: executable specifications that read like English, woo-hoo! With more tests, as we know, test cases get more complicated, and the maintenance effort (for existing and new tests) grows quickly (exponentially, if the team lacks the tools and capability to meet the challenges). That’s the case for all UI test automation, even with good syntax frameworks such as RSpec.

If someone thinks maintaining automated UI tests is easy, please read this interview with Microsoft test guru Alan Page: “95% of the time, 95% of test engineers will write bad GUI automation just because it’s a very difficult thing to do correctly”. If you think GUI automation is easy, then, going by Alan Page’s numbers, you are at the 1-out-of-400 SET (Software Engineer in Test) level (5% of engineers × 5% of the time = 0.25%). Please note that this is judged by Microsoft’s engineering standard. How many test engineers in your city (or state) can meet that standard? Not many, right? Only 1 in 400 of them can do real GUI test automation well, and they probably won’t think it is easy. This is not just Microsoft’s view. Google VP Patrick Copeland said: “In my experience, great developers do not always make great testers, but great testers (who also have strong design skills) can make great developers. It’s a mindset and a passion. … They are gold”.

Cucumber tests require more maintenance effort, a lot more, because of that extra layer, which is useless except for demos. That’s why, in my experience of over 10 years, every Gherkin automation attempt failed.

A recent big failure was at a large finance organization in my city (which has claimed to be Agile for 12+ years). Its Gherkin solution (in Java) was so bad that about three times the total development effort (time and money) was spent trying to maintain those ‘Gherkin BDD tests’. Of course, the tests were eventually abandoned. The excuse the management used was “the contractors who worked on this left, so we failed to maintain it”. Of course, this was not the truth. The root problem was a bunch of mediocre programmers who over-estimated their knowledge of test automation and made a bad choice based on a naive idea: executing the “Given-When-Then” user stories that business analysts wrote. Sadly, these kinds of mistakes keep being repeated.

Besides the human factors, what are the technical reasons why DHH and the creator of Cucumber are against BDD with Gherkin tests? Below is a comparison of the test tiers (based on Maintainable Automated Test Design) between a good test syntax framework, RSpec, and a bad one, Gherkin (when used for test automation).

The extra effort (right graph) comes from the ‘test-specific English parser’, the part DHH was referring to. Let’s look at an example Cucumber test.

1. Test (Gherkin) Layer

2. Step Definitions Layer!
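
To make the cost of this middle layer concrete, here is a minimal sketch, in plain Ruby, of how a Gherkin runner has to map each English sentence to code. It is not Cucumber's actual implementation: the scenario text, the `step` and `run_scenario` helpers, and the recorded actions are all hypothetical names chosen for illustration.

```ruby
# Hypothetical sketch of Gherkin-style step matching (not Cucumber's real code).

# Test layer: the scenario is plain English text as far as the tool is concerned.
FEATURE = <<~GHERKIN
  Scenario: One-way trip
    Given I am signed in as "demo_user"
    When I select a one-way trip from "New York" to "Sydney"
    Then I should see "Sydney" on the passenger page
GHERKIN

# Step definitions layer: every sentence needs a pattern plus a block of code.
STEPS = {}
CALLS = [] # records what would be handed down to the helper/page tier

def step(pattern, &block)
  STEPS[pattern] = block
end

step(/^I am signed in as "(.+)"$/) { |user| CALLS << [:login, user] }
step(/^I select a one-way trip from "(.+)" to "(.+)"$/) do |from, to|
  CALLS << [:select_one_way, from, to]
end
step(/^I should see "(.+)" on the passenger page$/) do |text|
  CALLS << [:assert_text, text]
end

# The "English parser": drop the keyword, find a matching pattern, run its block.
def run_scenario(text)
  text.each_line do |line|
    m = line.strip.match(/^(Given|When|Then|And|But) (.+)$/)
    next unless m # skip lines that are not steps, e.g. "Scenario: ..."

    sentence = m[2]
    pattern, block = STEPS.find { |pat, _| pat.match?(sentence) }
    raise "Undefined step: #{sentence}" unless pattern

    block.call(*pattern.match(sentence).captures)
  end
end

run_scenario(FEATURE)
```

In this sketch, every new sentence in a scenario needs a matching pattern, and rewording a sentence without updating its pattern raises "Undefined step". That bookkeeping sits on top of the normal work of maintaining the helper and page tier.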

3. Helper and Pages Tier

Helper: support/step_helper.rb, included in support/env.rb

Page class: pages/flight_page.rb
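
As a rough sketch of what this tier might contain (not the author's actual files), the page class below wraps element lookups behind intention-revealing methods, and the helper module gives test scripts a shortcut to it. The element ids, the method names and the `FakeDriver`/`FakeElement` stand-ins are assumptions; the stand-ins exist only so the sketch runs without a browser, and mimic the `find_element(how, what)`, `send_keys` and `click` calls of Selenium WebDriver.

```ruby
# Stand-ins for a browser driver so the sketch runs without Selenium installed.
class FakeElement
  def initialize(log, locator)
    @log = log
    @locator = locator
  end

  def send_keys(text)
    @log << [:send_keys, @locator, text]
  end

  def click
    @log << [:click, @locator]
  end
end

class FakeDriver
  attr_reader :log

  def initialize
    @log = []
  end

  def find_element(how, what)
    FakeElement.new(@log, [how, what])
  end
end

# Page class (e.g. pages/flight_page.rb). All element ids are hypothetical.
class FlightPage
  def initialize(driver)
    @driver = driver
  end

  def select_one_way
    @driver.find_element(:id, "trip_oneway").click
  end

  def enter_from(city)
    @driver.find_element(:id, "from").send_keys(city)
  end

  def enter_to(city)
    @driver.find_element(:id, "to").send_keys(city)
  end

  def click_continue
    @driver.find_element(:id, "continue").click
  end
end

# Helper (e.g. support/step_helper.rb): shortcuts shared by all test scripts.
# Assumes the including class sets @driver.
module StepHelper
  def flight_page
    @flight_page ||= FlightPage.new(@driver)
  end
end
```

Nothing in this tier knows which syntax framework sits above it. When an element id changes, only the page class needs updating.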

Please note that the helper and page classes, if designed well, can be 100% reusable regardless of which top-level syntax framework you use: Cucumber, Capybara or RSpec.

As a comparison, below is the test (top) layer for RSpec.
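
A rough sketch of the shape of such a test follows. It is written in plain Ruby so that it runs without the rspec gem; in a real spec file the body would sit inside `describe`/`it` blocks and the check would be an `expect`. `StubFlightPage` is a hypothetical stand-in for the page class tier that records calls instead of driving a browser.

```ruby
# Hypothetical stub for the page-class tier: records calls, drives no browser.
class StubFlightPage
  attr_reader :actions

  def initialize
    @actions = []
  end

  def select_one_way
    @actions << :select_one_way
  end

  def enter_from(city)
    @actions << [:enter_from, city]
  end

  def enter_to(city)
    @actions << [:enter_to, city]
  end

  def click_continue
    @actions << :click_continue
  end
end

# Test layer: calls the page object directly, with no sentence-matching tier.
# In an RSpec file this body would be wrapped as:
#   describe "Select flights" do
#     it "one-way trip" do
#       ...
#     end
#   end
def one_way_trip_test(page)
  page.select_one_way
  page.enter_from("New York")
  page.enter_to("Sydney")
  page.click_continue
  # RSpec would express this check as:
  #   expect(page.actions).to include(:click_continue)
  raise "continue was not clicked" unless page.actions.include?(:click_continue)

  page.actions
end

RESULT = one_way_trip_test(StubFlightPage.new)
```

The test steps are ordinary Ruby method calls, so renaming a page method is a plain code refactoring with no sentence patterns to keep in sync.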

There is no middle tier (the helper/page class tier is the same), so the test script is much easier to maintain. You can get the above test scripts from GitHub. For a more comprehensive example, see this article: WhenWise Regression Test Suite Reaches 500 Selenium tests and ~300K Test Executions.

RSpec, which describes itself as “Behaviour Driven Development for Ruby”, is the most popular framework of its kind. RSpec v3.8.0 alone has over 193 million downloads on RubyGems. While RSpec is also used for unit and integration tests, its download count is quite impressive. As a comparison, the most-downloaded Cucumber version, v3.1.2, has merely 8.8 million.

Cucumber is not the first failed test framework that uses English-like syntax for automated testing (it may be suitable for other uses, but definitely not for real test automation). Do you still remember FitNesse (it was quite big about 10 years ago, an example here)? Now it is hardly mentioned.

Some frustrated Gherkin ‘test engineers’ might grumble: “Maybe you just don’t understand Cucumber”. Sorry, I do know Cucumber well.

  • Cucumber was developed in Ruby; I am a winner of the 10th Ruby Award. I also worked for many years as a senior software engineer (contractor) using Java, C# and JavaScript.
  • TestWise, a next-gen functional testing IDE I created, supports Cucumber. (Kent Beck, the father of Agile, once said: “I hated the idea so I had to try it.”)
  • BuildWise, an international award-winning Continuous Testing server I created, supports executing Cucumber tests too.

It is fair to say that my Cucumber/Gherkin knowledge is better than that of most ‘cucumber automation engineers’.

Real functional test automation is far more than a fancy demo. If you truly believe Gherkin automation tests are the way to go, please do it well, don’t ruin the reputation of test automation. Make test automation visible and relevant to the team daily, enabling the team to release to production multiple times a day. That’s what I can do with raw Selenium WebDriver tests in RSpec for my web apps: ClinicWise, SiteWise, and WhenWise.

If you are an architect/manager of an organization doing Cucumber test automation (usually fooled by fake agile coaches), I suggest you write an email expressing your concerns, though not too directly, and save it (that’s important), or even start preparing your scapegoat. When the shit hits the fan, you can say: “I told you so” or “that was the fake agile coach’s fault”.

----

This article was originally published on my Medium blog on 2021-01-27, featured on "The Startup", the largest publication on Medium.

About the Creator

Zhimin Zhan

Test automation & CT coach, author, speaker and award-winning software developer.

A top writer on Test Automation, with 150+ articles featured in leading software testing newsletters.
