Friday, August 29, 2014

It's yer data! - how Google secured its future, and everyone else's

Dear Google,

This is a love letter and a call to action.

I believe we stand at a point of unique opportunity in how personal data is managed.

There is a limited range of data types in the universe, and practically speaking, the vast majority of software works with a particularly tiny fraction of them.

People, for example. We know things about them.

Names, pictures of them, people they know, statements they've made, etc.

Tons of web applications conceive of these objects. Maybe not all, but probably most have some crossover. For many of the most trafficked apps, this personal data represents a very central currency. But unfortunately, up until now we've more or less been content with each app having its own currency, one that is not recognized elsewhere.

You can change that. You can establish a central, independent bank of data, owned by users and lent to applications in exchange for functionality. The format of the data itself will be defined and evolved by an independent agency of some sort.

There are two core things this will accomplish.

1) It will open up a whole new world of application development free from ties to you, Facebook, Twitter, etc.

2) It will give people back ownership of their data. They will be able to establish and evolve an online identity that carries forward as they change what applications they use.

Both of these have a dramatic impact on Google, as they allow you to do what you do best, building applications that work with large datasets, while at the same time freeing you from concerns that you are monopolizing people's data.

A new application world

When developing a new application, you start with an idea, and then you spend a lot of time defining a data model and the logic required to implement that idea on that data model. If you have any success with your application, you will need to invest further in your data model, fleshing it out, and implementing search, caching, and other optimizations.

In this new world, all you would do is include a library and point it at an existing data model. For the small fraction of data that was unique to your application, you could extend the existing model. For example:
from new_world import Model, Field  # hypothetical library, for illustration

# Start from a shared, community-defined user model...
BaseUser = Model("https://new_world.org/users/1.0")

# ...and extend it with the one field unique to this application.
class OurUser(BaseUser):
    our_field = Field("our_field", type=str)

That's it. No persistence (though you could set args somewhere to define how to synchronize), no search, no caching. Now you can get to actually building what makes your application great.

Conceivably, you could do it all in JavaScript, other than identifying the application uniquely to the data store.

And you would be guaranteed data interoperability with Facebook, Google, etc. So if you built a photo-editing app, it could edit photos uploaded through any of those services, and they in turn could display the edited photos.
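
To make that concrete, here is a sketch in the same hypothetical new_world style as the example above. The Photo model URL, the owned_by query, and the apply_sepia helper are all illustrative assumptions, not a real API:

from new_world import Model  # hypothetical library, as above

# Assumed shared photo model; any participating app reads and writes it.
Photo = Model("https://new_world.org/photos/1.0")

def apply_sepia(image):
    # Placeholder for the editing logic that is your app's actual value-add.
    return image

def sepia_all(user):
    for photo in Photo.owned_by(user):      # assumed query API
        photo.image = apply_sepia(photo.image)
        photo.save()                        # synced back to the user-owned store

The point is that the photo objects live with the user, so an established service and a weekend side project would both see the same edits.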

Securing our future

People have good reason to be suspicious of Google, Facebook, or any other organization that is able to derive value through the "ownership" of their data. Regardless of the intent of the organization today, history has shown that profit is a very powerful motivator for bad behaviour, and these caches of personal data represent a store of potential profit that we all expect will, at some point, prove too tempting not to abuse.

Providing explicit ownership and licensing of that data via a third party won't take away the temptation to abuse it, but it will make abuse more difficult in a number of ways:

  • Clear ownership will make unfair use claims much more cut-and-dried
  • A common data format will make it much easier to abandon rogue applications
  • Reduced application development overhead will increase competitive pressure, lowering the chance of a single application monopolizing a market and needing to grow through exploitation of its users' data

A gooder, more productive Google

By putting people's data back in their hands, and merely borrowing it from them for specific applications, the opportunities for evil are dramatically reduced.

But what I think is even more compelling for Google here is that it will make you more productive. Internally, I believe you already operate similarly to how I've described here, but you constantly bump up against limitations imposed by trying not to be evil. Without having to worry about the perceptions of how you are using people's data, what could you accomplish?

Conclusion

Google wants to do no evil. Facebook is perhaps less explicit, but from what I know of its culture, I believe it aspires to be competent enough that there's no need to exploit users' data. The future will bring new leadership and changes in culture to both companies, but if they act soon, they can secure their moral aspirations and provide a great gift to the world.

(Interesting aside: Amazon's recently announced Cognito appears to be in some ways a relative of this idea, at least from the perspective of a developer looking to build things. Check it out.)

Thursday, April 24, 2014

PyCon 2014

I've now been back from PyCon for a week, and I've got some thoughts to share.

Scope

It was huge.

I usually try to memorize everyone's names, and I have some habits that help me with that. But there were so many people, I think that may have fallen apart. :)

A lot of hero worship, as I met, or at least observed from a distance, many people who helped shape my views on software (+Tres Seaver in particular).

Conversely, I managed to avoid running into those attending from my employers (I'm looking at you, +Kenneth Reitz, Sandy Walsh, and probably someone from RIM/BlackBerry).

Diversity

All the promotion of diversity was terrific. At the same time that it's great to be part of a movement that is markedly more female-friendly than the tech community at large, Jessica McKellar made it clear that we have so much farther to go. As the father of two girls, it's very important to me that we change the culture around technology to emphasize that there's no particular skillset or aptitude required for entry.

Software is our world, and we can empower EVERYONE to play a part in shaping it.

Content Overview

I enjoyed the talks that I went to, but I did skip more than I intended to. I had trouble letting go of work, and there was a lot of content that was relatively beginner-focused, or that covered tutorials I knew had high-quality online counterparts, should I need them. I feel like this was a deficiency of my own, and one I hope to handle better if I come back next year.

Metaprogramming

I've been flirting with creating my own language for a while now, and if I were to do so, it would probably be on top of Python. Thanks to talks by +Allison Kaptur and +Paul Tagliamonte, I feel much more prepared to do so.

Allison provided a brilliant guide to implementing the import mechanism from scratch. Having read +Brett Cannon's blog when he created importlib, I knew there was a huge amount of work that went into getting it right, so it was an intimidating area. Yet in 20 minutes Allison walked us through getting something functional.
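
For flavour, here's a minimal sketch of the kind of thing she walked through, using the modern importlib hooks rather than a from-scratch implementation; the DictFinder class and the toy module are my own illustration, not her code:

import importlib.abc
import importlib.util
import sys

# Toy "module database": module name -> source code.
SOURCES = {"hello_world": "GREETING = 'hi from a custom importer'"}

class DictFinder(importlib.abc.MetaPathFinder, importlib.abc.Loader):
    def find_spec(self, name, path, target=None):
        if name in SOURCES:
            return importlib.util.spec_from_loader(name, self)
        return None  # anything else falls through to the normal machinery

    def create_module(self, spec):
        return None  # use the default module object

    def exec_module(self, module):
        exec(SOURCES[module.__name__], module.__dict__)

sys.meta_path.insert(0, DictFinder())

import hello_world
print(hello_world.GREETING)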

Paul's talk on Hy was not quite so accessible, but perhaps even more inspiring. The relative ease with which Hy and Python can co-exist within the same project is just awesome, though mucking around with ASTs remains a bit of a scary idea.
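
To give a sense of what that AST mucking looks like (this uses plain Python's ast module, not Hy itself, and is purely an illustration), here's a transformer that rewrites addition into multiplication before compiling:

import ast

class AddToMult(ast.NodeTransformer):
    # Rewrite every binary + into a * before compilation.
    def visit_BinOp(self, node):
        self.generic_visit(node)
        if isinstance(node.op, ast.Add):
            node.op = ast.Mult()
        return node

tree = ast.parse("2 + 3", mode="eval")
tree = AddToMult().visit(tree)
ast.fix_missing_locations(tree)
print(eval(compile(tree, "<ast>", "eval")))  # prints 6, not 5

Hy lives in the same family: it compiles Lisp forms down to Python AST nodes, which is why the two can share a project so easily.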

Sprints

While I was skipping talks, I consoled myself with the thought that I would really engage during the sprints (I had a day and a half scheduled for these). But I didn't, and while I think that had more to do with me (once again, I worked), I'll share what I think could have been done better, in case anyone else felt the same way.

Technically Sprints started Sunday evening, but I get the feeling that no one was actually interested in running them Sunday evening (or maybe my timing was off). There were a handful of people there, but no clear organization or plan about what was to be worked on.

Monday morning, it was certainly better attended, but it still wasn't inviting. There was a central chart of what rooms contained what projects, but within the rooms there was no indication of who was working on what. From my limited time being involved in or running other short coding sessions, I was also surprised not to see much use of flipcharts or whiteboards.

I guess how I would do it, if I ran it next year (I'm not volunteering, yet), is to provide signs for each project to put at their table, and encourage each project to write out a list of goals for the sprint somewhere they can be publicly examined and crossed off. Perhaps also provide special shirts/hats/badges to the "leader" of each sprint. The experience I would like is for someone to be able to wander in, examine what each project is doing without actually bothering anybody, and then, if they find something they think could fit them, to know who to ask.

Misc.

  • Ansible is what we're using at SFX, and while I've had some experience with it, I have a much more robust appreciation for it, thanks to +Michael DeHaan.
  • Peep should be included in the standard library. Seriously.
  • Asyncio makes things most people do easier. Bravo! (A tiny sketch follows this list.)
  • IPython Notebook is cool, and +Catherine Devlin's talk about executable documentation has me itching to try it out.
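
Here's the asyncio sketch promised above, in the generator-based style that was current for Python 3.4 at the time; the fetch coroutine just sleeps to stand in for real I/O:

import asyncio

@asyncio.coroutine
def fetch(name, delay):
    # Stand-in for a network call; the event loop runs other work while we wait.
    yield from asyncio.sleep(delay)
    return "%s done" % name

loop = asyncio.get_event_loop()
results = loop.run_until_complete(
    asyncio.gather(fetch("a", 1), fetch("b", 1)))
print(results)  # both finish after about one second, not two
loop.close()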

Conclusion

As someone who has been around the block but doesn't find much time to actually code anymore, I may not be the core audience for PyCon. But I'm still delighted to have finally made it to one, and I'm really tempted to make it a family trip next year.

Friday, October 11, 2013

Rethinking my excuses about hiring for Test Automation


"Debugging is twice as hard as writing the code in the first place. Therefore, if you write the code as cleverly as possible, you are, by definition, not smart enough to debug it."
--Brian Kernighan

For a long time, I've made the excuse that it's hard to hire QA engineers and the like because of the stigma that the desired career path runs from testing to development, not the other way around. That may be part of it, but I'm beginning to realize that there might be something much more significant.

Being a good Automator is HARD. As Kernighan says, debugging is hard, and automation is all debugging.

Practically speaking, the act of debugging is the investigation of a behaviour to determine its cause. You know the software does a thing, but you don't know why, and you need to figure out how to stop it (usually). So you run it in a controlled environment and try to get that behaviour to happen.

What Automators do, most of the time, is write software that interfaces with other software. Automation is its own integration point, but generally speaking, it's a low-priority client, so you frequently have to exploit one or more other integration points in order to get the software to behave as desired for your automation. Usually, what you want the software to do is on the margins of what it was expected to do, so you do a lot of debugging to identify and simulate the conditions that might generate the desired behaviour.

Ok, so that's a lot of horn tooting, but I've met nearly a dozen good Automators, and without exception they are fantastic software developers. That's not to say there aren't other factors that drive people away from automation, but consider this a call to action for those who want to prove their mettle as software craftspeople. 

Be an Automator, spend your days debugging everyone else's software, drive the art forward, and enjoy (in the long run) incredible job security!

As an aside, I think the headline quote is also what binds me to the Python community. The Zen of Python guides us to avoid "clever" solutions whenever possible, out of the shared recognition that, in the long run, software spends a lot more time being debugged than developed.

Wednesday, August 21, 2013

The Automator's Manifesto, Part One

In my career in the software field, I've had to teach myself what quality software is, and what the differences are between successful software projects and unsuccessful ones. It's been a long and often frustrating journey, but I feel like I've come out of it with some principles that could help guide people in the field.

My principal strength in software development is pattern recognition, especially behavioural. I've used this over the years to refine my approach to software development, by analyzing my own behaviour and that of those around me. For this topic, I've tried to apply it at a higher level to my experience with test automation. What is it about automation that most consistently delivers value? What consistently doesn't? Are there practices that can guide us towards the former and away from the latter?

Certainly, to a degree, everything here is derived from common software best practices, but I believe I am able to tailor them in a way that is uniquely suited to the automator. That's why this part is sub-titled…

Automation is Software

(I use the term "automator" as a generalization of test automation developer, quality engineer, software engineer in test, and a dozen other titles).

 

Apply Best Practices, as best you can

I'm sure we've all heard the statistics about how 50%, 75%, or 90% of all software projects fail in some way. If we've been around the block, we may even understand some of the steps to take to improve our chances, commonly called "best practices". Automation projects are almost always directly dependent on some other software project, and when those projects go down, the automation often goes with them.

So from the start, the odds aren't good with your automation. But then consider that of all software projects, automation seems to be the place where best practices are least… practiced. I'm talking about things like:
  • Requirements gathering
  • Structured development process
  • Documentation
  • Unit, etc. tests
  • Source control (I've seen this missing several times)
Not to say that every piece of automation needs all of these, just that an automation project is like any other software project, and if we fail to apply common knowledge about software projects to it, we are shooting ourselves in the foot.
What's more, by practicing these things, you become better equipped to ensure they get applied to the project you are developing automation for, primarily because it can help you establish credibility with the developers working on it. Walking the walk, and all that.

Have a Client

A key aspect of most successful software projects is the presence of an engaged client. Automation is no different. Someone who understands the value that you are supposed to be delivering is in the best position to help you understand what needs to be done to deliver that value. Do not simply write tests because there was a decree that "we need tests for X".

Instead, you need to understand where the desire for tests comes from. Is it that developers are hesitant to make changes because they have no safety net? Is it that there are lots of regressions in production? Is it that there's a performance optimization project underway, and metrics are needed? Each case suggests a different focus, and there are many more cases.

But in order to get to this point where you have identified the need, and make progress against meeting it, you need a partner who will vet requirements and validate progress. You need a client. This can be a team of developers, a particular developer, someone in management, or others. But they need to understand the value you are expected to deliver, and have the time to confirm that you are doing so.

Fight on the best terrain available to you

As anyone who has been in this field for a while likely knows, actually automating tests is not always the best way to improve the quality of the software. As described in "Test is Dead", many companies have user bases that accept being used for beta testing. It might not always be so, but if that is the situation you find yourself in, attempting to keep bugs from those beta testers will not be an ideal use of your time.

A robust functional test suite for a rapidly changing product might cost more in maintenance than it delivers in its first few years. A development team without a mature development process will gain little from being forced to write unit tests. Sometimes there's more value to be gained in doing SysAdmin or DevOps type work, such as creating continuous integration infrastructure. The important thing about being an automator is establishing processes that will make producing higher quality software easier. Whether those are actual automated tests run against software, or merely conventions established to improve communication during development, is not important. What is important is that you make producing quality software the easiest thing for the developers.

In order of priority, I would suggest:
  • Establish a development process
  • Establish a development environment
  • Automate tests
  • Automate monitoring
Each of these makes the following item easier and more valuable.

Summary

Automation is Software, so:
  • Follow best practices, as best you can
  • Have a client, who can validate the value you deliver
  • Understand the terrain of the project you are automating against, and tailor your focus to that
Join me next time as I dive into the value of Proactive Transparency, the danger of Entropy Factors, and the deceptive bliss of Measuring Everything.

Monday, April 29, 2013

Beyond GTAC

In 2013, in the twilight of my "testing" career, I was able to attend the Google Test Automation Conference for the third time. Once again, I was impressed by (and jealous of) many of the solutions presented. But for me at least, I'm not sure I'm the target audience anymore.

About me

Before I get into that, let me expand on what I said above about twilight. The arc of my career is the story of trying to figure out how to use technology to improve people's lives. This began as computer and then network support, making sure people were able to use the tech they already had. From there, I began to write software, for myself and then those who would pay me to do it.

Throughout this process, I have had to constantly analyze and refine my own processes. I made mistakes, researched best practices, and tried again. Everything I do comes from continual experimentation and optimization. My career transitions also come out of this drive for self-improvement.

As I wrote and used more software, and especially assisted others in using it, it became (gradually) clear that the only useful measure of quality was whether it made it easier for users to accomplish their goals.

Most recently, that career has led me to look beyond myself in terms of delivering quality software. I'm now responsible for optimizing the output of a team of fantastic developers. In the majority of situations we currently encounter, writing tests is not the most efficient path towards this.

But enough about me

So, to a certain degree, GTAC may not be for me because I need to take a wider view than many of the participants, even though we share the same goals. But I think there's more to it than that. The people at GTAC are, on average, likely to be the best test engineers on the planet. Yet with few exceptions, the work they presented would have fit right in with GTAC from 4 or 6 years ago. As a community, we are failing to establish and exploit institutional memory, so we keep having to re-solve the same problems.

As a specific example, Google itself has an incredible build and release system, the result of a huge ongoing investment. From what we saw of every non-Google presentation, other companies' capabilities for automated testing are hampered by the lack of this kind of infrastructure. Not to imply that Google has any obligation to provide it, but it seems to me that everyone could benefit substantially from more direct collaboration on it. Not to say the tooling in this area hasn't improved, but it still seems like a problem that every company continues to try to solve in its own way.

Don't get the impression from this that GTAC itself is anything less than fantastic. It has helped me grow so much, and likely raised my expectations for what should be possible. If you have any sort of passion for automated testing, there would be few things more valuable than connecting with this community. Also, despite my griping, there are some things happening today that simply wouldn't have been possible in the recent past.

About those exceptions

Selenium. Not to be trite about it, but the Selenium project is the biggest change to the test automation community over the past 6 years. It's gone from being a nifty tool for somewhat flaky web automation to a standard (literally) platform upon which we can converse about automation and build other tools. Some examples of this include Appium, which is leveraging it to provide cross-platform mobile app testing, and Marionette, which is integrating Selenium into builds of Firefox OS. It's not the end of the story by a long shot, but having Selenium allows us to elevate our dialogue around test automation.
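
As a small illustration of that shared vocabulary, the same handful of WebDriver calls drive a desktop browser, or a mobile app via Appium; only the driver construction changes. A sketch using the Python bindings of the era (the target page and element name are just placeholders):

from selenium import webdriver

driver = webdriver.Firefox()  # an Appium or Remote driver exposes the same API
driver.get("https://www.google.com")
box = driver.find_element_by_name("q")  # element name is just an example
box.send_keys("test automation")
box.submit()
print(driver.title)
driver.quit()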

Misc. Buck, by Facebook, seems like a nifty solution to a problem I almost have (though I'm not sure when +Simon Stewart finds time to stop being funny and start writing software). Almost everyone agrees you should just kill flaky tests, which is good, but I'm not sure it's the endpoint in that conversation. Getting developers to write tests was also a near-unanimous sentiment (and why wouldn't it be in this group?), though unlike Thomas Petersen, I didn't get the impression that it was intended to exclude testers from the development process. Pex4Fun is just cool. The Android UI testing team embedding their automation commands in the build thread seems a bit crazy, but still a really clever way to avoid the whole click-and-sleep cycle that plagues most UI automation.

Organization. Despite having a more limited experience of the Google office, I feel like this was by far the best run conference I have ever been to. Things were on schedule, very smooth, and +Anthony Voellm is one heck of an MC.

Summary

GTAC remains in my mind the place to be if you are interested in doing test automation. I believe there's more that the test automation/software quality community needs to do to grow together, but the fact that this conference isn't all about that is mostly a reflection of my changing priorities, rather than any failings on their part.

Full disclosure: I spent 3 years in automation at RIM/BlackBerry, and this was a mobile testing conference, so I'm probably just jaded.

Friday, March 9, 2012

Recursively formatting a dictionary of strings

I have a use-case where I'm using a dictionary as a basis for generating a configuration file, and many of the values in the file share pieces, such as URL bases or credentials. I will also be generating configuration files for several different instances from the same dictionary.
I wanted a simple way to define replaceable parts of the dictionary. I made a couple of Google searches, didn't find anything on the first page, so I wrote this:
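Roughly, it works like this (a sketch of the approach, not necessarily the exact snippet): every string value gets formatted with str.format, using the dictionary's own top-level entries as the substitution context, recursing into nested dictionaries so shared pieces are defined once.

def format_dict(config, context=None):
    # Format every string value against the top-level dictionary,
    # recursing into nested dictionaries along the way.
    if context is None:
        context = config
    result = {}
    for key, value in config.items():
        if isinstance(value, dict):
            result[key] = format_dict(value, context)
        elif isinstance(value, str):
            result[key] = value.format(**context)
        else:
            result[key] = value
    return result

# Example: one base URL shared by several values.
settings = {
    "base_url": "https://example.com",
    "login_url": "{base_url}/login",
    "api": {"photos": "{base_url}/api/photos"},
}
print(format_dict(settings))
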
I hope it's useful to someone else, or that I find it next time I have this use-case. :)

Thursday, March 8, 2012

Shoutout to PyDev

I've been using PyDev for many years, for Python and Jython development of all kinds. The more recent releases have made some big usability improvements. Specifically, autocompletion works in many more cases, and the built-in refactoring support is proving handy. Rough edges that have bugged me for years are being steadily smoothed away.
So thanks for all the free stuff, Fabio! If I were at PyCon, I would buy you a drink.