I don't know if this calculator was good or bad, but the rationale, at least on its face, sounds ridiculous.
Visitors who didn't see the calculator were 16% more likely to sign up and 90% more likely to contact us than those who saw it. There was no increase in support tickets about pricing, which suggests users are overall less confused and happier.
Of course if you hide from your users the fact that your product might cost a lot of money, more of them will sign up. Whether they are better off depends on whether they end up with a bill they are unhappy with at some unspecified future date. That's not something you will figure out from a short-term A/B test on the signup page. So this seems like totally useless evidence to me.
I see this dynamic frequently with A/B tests. For example, one of my coworkers implemented a change that removed information from search result snippets. They then ran an A/B test showing that after removing the information, people clicked through to the search result page more often. Well, obviously they might click through more often if information they wanted, which was previously in the snippet, now requires a click to see. The question of which is actually better seemed to have been totally forgotten.
Absolutely true! An A/B test enthusiast on our team once significantly reduced padding on the pricing page to bring the signup button above the fold and used the increase in signup button clicks as proof that the experiment was a success. Of course, the pricing page became plain ugly, but that didn't matter, because "Signups are increasing!!"
In this case, I do agree that the calculator is a bit daunting if you're not used to all the terms, but what to do about it should have been an intuitive decision ("what can we do to simplify the calculator?"). I'm not a fan of the A/B testing culture where everything needs to be statistically analyzed and proven.
I'm not sure I'm following you here, so perhaps you'd care to elaborate?
The GP's critique was that it was perhaps just creating a problem elsewhere, later on. I'm not seeing the similarity to your case, where the change is cosmetic, not functional.
The issue of whitespace (padding) is subjective (see the recent conversation about the old and new Windows control panels), but "scrolling down" does seem to be something that should potentially be avoided.
If sign-ups are increasing is that not the goal of the page? Is there reason to believe that the lack of padding is going to be a problem for those users?
I think one problem is that a better design would move the button above the fold without ruining the spacing, and therefore achieve a better result with even higher lift, but someone focused on just the numbers wouldn't understand this. The fact that the A/B test has a big green number next to it doesn't mean you should stop iterating after one improvement.
It still seems like a valid use case for A/B testing. Ideally you'd redo the design, and that's something you could A/B test as well to see if it helps.
My guess is yes, because consistency in design usually makes people assume better quality.
A/B tests suck because you are testing two cases, neither of which is probably the best one. If you take the learnings from the A/B test and iterate on your design, that's a viable strategy, but proposing a shit design and insisting on deploying it is wrong.
That's like saying that comparing a veggie burger to a normal burger sucks because neither are ice cream.
A/B tests, by definition, test between A and B. It is very likely that neither is the best option.
But how will you find the best option if you don't measure options against each other?
The problem is that in 90% of companies the decision is between A and B, not between iterations of them.
I'm going to assume the 90% number was simply hyperbole, because it's trivially false in any number of ways:
Firstly, many businesses have never heard of A/B testing, much less applied it rigorously to proposed changes.
Secondly, many businesses have found their niche and don't change anything. There's a reason "that's not how we do it here" is a cliche.
Thirdly, a whole slew of businesses are changing things all the time. My supermarket can't seem to help themselves iterating on product placement in the store.
Blaming testing in general, or A/B testing specifically, for some companies being unwilling to change or iterate seems to be missing the actual problem.
Frankly, with regard to web sites and software I'd prefer a little -less- change. I just get used to something and whoops, there's a "redesign", so I have to learn it all over again.
Okay - but the design without padding converted better than the one with padding.
A/B tests don't give you the best possible choice of all choices. They just give you the better choice between A & B.
The business shouldn't delay rolling something out that increases conversions *significantly* because a better design *might* exist.
You can A/B test "better" designs in the future until one converts better.
A/B testing has something to say about (ideally) a single choice.
It has nothing to say about optimal solutions. Nothing about A/B testing suggests that you have reached an optimal point, or that you should lock in what you have as being the "best".
Now that the button position (or tighter layout) has been noted to have a material effect, more tests can be run to determine any more improvements.
The issue is that A/B testing only looks at outcomes, not reasons. There is a possibility that having the sign-up button above the fold wasn't the contributing factor here, or that it was only one of several. Having to scroll through an estimate may lead a potential customer to believe that pricing is too complex or, worse, that the vendor is trying to hide something. Perhaps there are other reasons. The problem is that A/B testing will only tell you the what, not the why.
I feel like that example is missing some context - if signups did increase then their experiment was successful - we aren’t here to make pretty pages, we’re here to make money.
The problem is that it's easy to prove that signups are increasing, and a lot harder to prove that there was a measurable increase in the number of paying users. Most A/B tests focus on the former, very few on the latter. We had a free plan, and most users who signed up never made a single API request. So assuming that the increase in signups is driving more business is just foolhardy.
You can always track the signup-to-paying-users ratio. The purpose of a landing/pricing page is to get users to sign up. Unless some dark pattern or misinformation is used to confuse users into signing up, more users are a positive thing.
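As a minimal sketch of what that tracking could look like, assuming you log signups and first payments tagged with the experiment arm (all the event shapes and names below are made up):

```python
from collections import defaultdict

# Hypothetical event records: each signup carries the experiment arm the
# visitor saw, and paying users later show up in a first-payment log.
signups = [
    {"user_id": "u1", "arm": "calculator"},
    {"user_id": "u2", "arm": "no_calculator"},
    {"user_id": "u3", "arm": "no_calculator"},
]
first_payments = [{"user_id": "u2"}]

paid_users = {e["user_id"] for e in first_payments}

# Count signups and signups that eventually paid, per experiment arm.
totals = defaultdict(lambda: {"signups": 0, "paid": 0})
for s in signups:
    bucket = totals[s["arm"]]
    bucket["signups"] += 1
    if s["user_id"] in paid_users:
        bucket["paid"] += 1

for arm, t in totals.items():
    ratio = t["paid"] / t["signups"] if t["signups"] else 0.0
    print(f"{arm}: {t['signups']} signups, {t['paid']} paid ({ratio:.1%} signup-to-paid)")
```

Comparing that ratio (and the absolute number of paid users) across arms, rather than raw signups, is what would actually answer the "more business" question.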
The problem with A/B test optimization is that, unless you're extremely careful, they naturally lead you to apply dark patterns and misinformation.
Ok, but the example we're discussing is one where the signup button was simply moved to a different position on the page. That's not a 'dark pattern'.
That doesn't sound like a signup problem; what was the goal behind the free plan? Drive more paying users? Raise company profile? Lock in more users?
Okay? The A/B test sought to measure which of two options A and B led to more signups.
Your "A/B test enthusiast" was not testing for or trying to prove a causal relationship between increased signups and more business.
If he made the claim separately, then that is the context that is now missing from multiple comments.
If I had a choice between ugly and rich and pretty and poor I'd be sorely tempted by ugly, particularly if I was making the decision for an organization.
The problem with their calculator was that users entered slightly wrong data, or misunderstood what some metric meant, and suddenly a price 1000x the real one was shown. Their dilemma was "how do we fix those cases?", and the solution was "get rid of the messy calculator".
But they are not hiding a 1000x cost; they are avoiding losing users who get a wrong 1000x quote.
Why not fix the calculator in a way that avoids/mitigates the scenarios where users get wrong quotes, and then do an A/B test? This setup seemingly tilts towards some sort of dark pattern, IMO.
Because the results were probably wrong because the inputs were wrong (exaggerated by over-cautious users). There is no automated way to avoid that in a calculator; only a conversation with a real person (sales, tech support) will reveal the bad inputs.
I wonder if some of that could have been automated. Have a field to indicate if you are an individual, small business, or large business, and then at least flag fields that seem unusually high (or low, don’t want to provide too-rosy estimates) for that part of the market.
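A rough sketch of that idea, with made-up segment names, field names, and thresholds (the real ranges would have to come from observed usage), could look like this:

```python
# Illustrative "typical" ranges per declared customer type; the numbers here
# are invented, not real market data.
TYPICAL_RANGES = {
    "individual":     {"monthly_queries": (0, 100_000),     "namespaces": (1, 5)},
    "small_business": {"monthly_queries": (0, 5_000_000),   "namespaces": (1, 50)},
    "large_business": {"monthly_queries": (0, 500_000_000), "namespaces": (1, 1_000)},
}

def flag_unusual_inputs(segment: str, inputs: dict) -> list[str]:
    """Return warnings for inputs outside the typical range for the selected
    segment, so the calculator UI can ask the user to double-check before quoting."""
    warnings = []
    for field, value in inputs.items():
        low, high = TYPICAL_RANGES[segment].get(field, (float("-inf"), float("inf")))
        if value < low:
            warnings.append(f"{field}={value} looks unusually low for a {segment}")
        elif value > high:
            warnings.append(f"{field}={value} looks unusually high for a {segment}")
    return warnings

print(flag_unusual_inputs("individual", {"monthly_queries": 2_000_000, "namespaces": 3}))
```

It wouldn't catch every over-cautious input, but it would at least prompt a second look before showing a 1000x quote.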
They tried to mitigate:
Let's not take a PR piece completely at face value.
There's probably a bit of both, at the very least.
In my mind Pinecone is an exemplary case of modern "social media marketing" for a technology company.
They started on vector search at a time when RAG in its current form wasn't a thing; there were just a few search products based on vector search (like a document embedding-based search engine for patents that I whipped into shape to get in front of customers), and if you were going to use vector search you'd need to develop your own indexing system in house or just do a primitive full scan (sounds silly, but it's a best-case scenario for full scan, and vector indexes do not work as well as 1-d indexes).
They blogged frequently, consistently, and with heart about the problem they were working on, which I found fascinating because I'd done a lot of reading about the problem in the mid-aughts. Thus Pinecone had a lot of visibility for me, although I don't know if I am really their market. (No budget for a cloud system; full scan is fine for my 5M-document collection right now, and I'd probably try FAISS if it wasn't.)
Today their blog looks different than it used to, which makes it a little harder for me to point out how awesome it was in the beginning, but this post is definitely the kind of post they made when they were starting out. I'm sure it has been a big help in finding employees, customers, and other allies.
Thank you. :) I don't think of it as social media marketing but more as helping our target audience learn useful things. Yes, that requires that they actually find the articles, which means sharing them on social, being mindful of SEO, and so on.
Probably our learning center is what you’re thinking of. https://www.pinecone.io/learn/ … The blog is more of a news ticker for product and company news.
Being a complete cynical bastard here but I sometimes feel like these calculators are actually meant to obfuscate and confuse and the result is that a startup worried about scale is going to pay over the odds and then deal with ‘rightsizing’ after the costs get out of control.
I felt like that with Elastic serverless's pricing calculator, which on the surface looks perhaps cheaper or more efficient than a normal managed cluster, because you think it would be like Lambda. Except there are so many caveats and unintuitive hidden costs that you'll likely pay more than you think.
Can't speak for everywhere of course, but at the places I have worked nobody likes spikes or over-commitments. The customer is shouting at your people, salespeople and support spend time and get stressed dealing with them, and leadership gets bogged down approving bill reductions. Even if a reduction is granted, customers remember the bad experience and are probably more likely to churn.
My cynical take: I make things that look hard to make in order to impress you, but if you make them for me, I feel like my money is going into the calculator rather than the product.
I designed an internal system that optimises for long-term outcomes. We do nothing based on whether you click "upgrade". We look at the net change over time, including the impact on engagement and calls to support months later, and whether you leave 6 months after upgrading. Most of the nudges are purely for the customer's benefit, because that improves lifetime value.
You could only be measuring in aggregate, no? The overall signal could be positive while one element happens to be negative and another is overly positive.
Well, the nudges are adjusted in aggregate, but diced in various ways. The measurement is very much not in aggregate. We'd see positive and negative outcomes roll in over multiple years, and we wanted it per identifier (an individual). I've heard of companies generating a model per person, but we didn't.
A silly amount of work, but honestly lots of value. Experimentation optimising for short-term goals (e.g. upgrades) is such a bad version of this; it's just all that is possible with most datasets.
That's the one thing I was wondering about with their A/B test. The calculator might immunize against unhappy customers later on. I think they could've looked at something like the percentage of customers who leave one or two billing cycles later.
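As a sketch of what that check could look like, assuming you can join experiment assignment to billing/cancellation records (the record shapes and dates below are invented):

```python
from datetime import date

# Hypothetical joined records: the arm each customer saw at signup, their
# signup date, and their cancellation date (None if still active).
customers = [
    {"arm": "calculator",    "signed_up": date(2024, 1, 5), "cancelled": None},
    {"arm": "no_calculator", "signed_up": date(2024, 1, 7), "cancelled": date(2024, 2, 20)},
    {"arm": "no_calculator", "signed_up": date(2024, 1, 9), "cancelled": None},
]

def churned_within(customer, days):
    """True if the customer cancelled within `days` of signing up."""
    if customer["cancelled"] is None:
        return False
    return (customer["cancelled"] - customer["signed_up"]).days <= days

# Compare churn roughly one and two billing cycles (~30 and ~60 days) after
# signup, per experiment arm.
for window in (30, 60):
    for arm in ("calculator", "no_calculator"):
        cohort = [c for c in customers if c["arm"] == arm]
        churned = sum(churned_within(c, window) for c in cohort)
        print(f"{arm}, {window}-day churn: {churned}/{len(cohort)}")
```

If the no-calculator arm churns noticeably harder a cycle or two later, the short-term signup lift starts to look a lot less impressive.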
Relatedly, there seemed to be no acknowledgement of the possibility of dark incentives: many businesses have found they can increase sales by removing pricing details so that prospective customers get deeper into the funnel and end up buying because of sunk time costs even though they would have preferred a competitor. Example: car dealerships make it a nightmare to get pricing information online, and instead cajole you to email them or come in in person. In other words, a calculator makes it easier to comparison shop, which many businesses don't like.
I have no idea if that's a conscious or unconscious motivation for this business, but even if it's not conscious, it needs to be considered.
Would you consider doubling your prices so users perceive your product as having higher value a dark pattern?
Only if it were untrue, i.e., if I were motivated by the fact that it made my customers believe my product was better than a worse product.
For me the principle is based on not exploiting the gap between the consumer and a better informed version of themselves. (“What would they want if they knew?”) There’s a principle of double effect: I don’t have to expend unlimited resources to educate them, but I shouldn’t take active steps to reduce information, and I shouldn’t leave them worse off compared to me not being in the market.
Feels like it kinda fits under fake social proof. https://www.deceptive.design/types
How is that a function of the overly-simplified and often-wrong calculator?
If the user is never there to be happy/unhappy about it in the first place, then how would you test this anyway?
By closing the loop and increasing engagement, you are increasing the chance that you can make the customer happy and properly educated through future interactions.
The author was very careful with their words: they didn't say the calculator was wrong. They said it was confusing and sensitive to small adjustments. It's likely that the same confusion and variable sensitivity exists during usage. IMHO they should have bitten the bullet and revised the pricing model.
Fair point. The author commented elsewhere here and stated that it's not the usage but users' understanding of the variables in the calculator that is often off by more than 10x. From the response, it seems like the only way to know how much something will cost is to actually run a workload.
Edit: if the customer is getting a wrong answer because of wrong inputs, IMO it's still a wrong answer.
I don't know enough to agree or disagree, because they may be offering close to at-cost pricing, which might give better overall pricing than competitors. It's a complex game :)
This is a blind spot for pretty much entire industry, and arguably spreads beyond tech, into industrial design and product engineering in general. Of course being transparent with your users is going to be more confusing - the baseline everyone's measuring against is treating users like dumb cattle that can be nudged to slaughter. Against this standard, any feature that treats the user as a thinking person is going to introduce confusion and compromise conversions.
Essentially the failure is that we do treat users like this by relying on mass collection of data instead of personal stories. To be human is to communicate face to face, with words and emotions. That's how you get nuanced conclusions. Data is important, but it's far from the whole story.
That’s why you need domain experts and clear explanations and hypotheses before you experiment, otherwise you’re throwing shit at a wall to see what sticks.
Companies can keep monitoring cohorts and comparing retention to check for the potential outcomes you highlighted.
They should have looked into this to see how to make it more obvious or more reflective of a "regular use case".
The sliders there are not very detailed. For example, what are namespaces, and how many would a typical user need? Is 100 too much or too little? And if this is one of the variables that is too sensitive, they would need to represent it in a different way.
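One crude way to find the overly sensitive variables, assuming you can call the pricing formula directly (the function and field names below are made up for illustration), is to nudge each input a little and see how much the estimate swings:

```python
def estimated_monthly_price(inputs: dict) -> float:
    # Stand-in for the real pricing formula; the coefficients are invented.
    return inputs["namespaces"] * 0.5 + inputs["monthly_queries"] * 0.0001

def sensitivity(inputs: dict, bump: float = 0.10) -> dict:
    """Bump each input by 10% and report the resulting fractional change in the
    estimate. Inputs whose small changes swing the price wildly probably
    shouldn't be a bare slider."""
    base = estimated_monthly_price(inputs)
    result = {}
    for field, value in inputs.items():
        nudged = dict(inputs, **{field: value * (1 + bump)})
        result[field] = (estimated_monthly_price(nudged) - base) / base
    return result

print(sensitivity({"namespaces": 100, "monthly_queries": 1_000_000}))
```

Anything that dominates the estimate could then get a guided input (presets, examples of typical values) instead of a raw number box.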
There are multiple companies on my blacklist that definitely got me to sign up. But as there was a hook that anybody acting as a trustworthy partner would have mentioned, I parted with them — potentially for life. You know, things like "click here to sign up, sacrifice your newborn on a fullmoon night while reciting the last 3 digits of pi to cancel"
I don't particularly care whether their A/B test captures that potential aspect of customer (dis)satisfaction, but I am not sure how it would.