Five different types of fried cheese @coopcoopbware - Tumblr Blog

Wait and Hope

After 15 years, today is my final working day at Mozilla. When people leave Mozilla, they frequently exercise their privilege to send one final email to the entire company saying goodbye. I've elected not to do that and am instead posting my thoughts here. Call it hubris, but there aren't many people left at Mozilla who can appreciate what 15 years means. Most of my colleagues have already moved on.

2020 has been hard. Layoffs at Mozilla, and the threat of more layoffs, made this a particularly rough year. As a manager, putting on a brave face for others has left me emotionally spent at the end of every week. This is on top of the malaise associated with a decade of declining market share (and associated relevance) for Firefox.

As I reach the end of my tenure at Mozilla, inevitably I look back to try to figure out what I could have done differently to make Mozilla more successful. Did I miss a window of opportunity somewhere to help Firefox succeed? Might this year have been avoided, or its impact softened?

In broad strokes, sure, I could have worked longer or harder, pushed to get projects completed faster or to a higher standard. More specifically, if we had accelerated our transition from tinderbox to buildbot, or from buildbot to Taskcluster, could we have kept better pace with competitors? Maybe we could have recognized the scaling needs sooner and avoided migrating our entire continuous integration infrastructure twice?

The safe answer is that, yes, there are many things I could have done differently, but hindsight is also 20/20.

When I started this reminiscence, I felt like maybe my impact had decreased over time. It was tempting to think that my influence peaked back in 2005 when it was just 25 of us hacking together on Firefox under the Can Bridge in Ellis St. But that's absolutely not true.

Mozilla, at its core, is about people. The manifesto is an invitation. This is a long game; the changes that Mozilla wants to effect in the world aren't best measured in quarterly earnings reports.

As a manager at Mozilla, I've had the opportunity to hire dozens of people. I've helped interns develop into kick-ass engineers. I've touched the careers of countless people and hopefully instilled some fundamental values along the way.

Many of those people are no longer with Mozilla. This is a good thing, both for them and for Mozilla. The world needs more Mozilla. In an industry largely bereft of introspection and in many cases lacking a moral compass, the Mozilla diaspora has some serious work to do. At the end of the day, if all I've done is helped spread Mozilla values out into the wider world, I'm happy with that legacy.

Mozilla has gone through big changes this year. I don't know if those changes are enough for it to be successful, but I am hopeful. As part of the old guard, I am happy to step aside at this juncture to create space and opportunity for the new guard in my stead.

I'm starting a new adventure as a Senior Development Manager at Unity in January. I'll be taking my Mozilla values with me.

#Mozilla #Unity #transition

Taskcluster: CI for Engineers

My team at Mozilla has been working towards something special for over two years.

When I joined the team, we felt that we had a pretty good internal product in Taskcluster, the task execution framework that supports Mozilla's continuous integration (CI) and release processes. It served Mozilla's CI needs well, and was scaling admirably compared to previous solutions.

But could it be more?

Would people outside of Mozilla benefit from Taskcluster, and could they deploy it? Perhaps more importantly, could we develop a community of users around Taskcluster that would be self-sustaining?

We were determined to find out.

We started by taking a hard look at the Taskcluster platform and found a few big impediments to wider adoption. First, we would need to reduce the setup complexity. We would also need to reduce the number of cloud accounts required to get started. At the time, Taskcluster required at least two separate cloud providers (AWS and Azure) and a Heroku account to launch.

Over the past year, we removed the need for Azure as a back-end data store and removed the need for Heroku for deployments. Now if you have a Kubernetes environment set up, you're ready to install Taskcluster. You'll still need AWS S3 access for artifact storage, but we're working to make that configurable too.

While we were making all these changes behind the scenes, we were thinking about how we would actually try to garner more interest in Taskcluster outside of Mozilla. In true Mozilla fashion, we have always been developing Taskcluster in the open, but that doesn't necessarily mean we were discoverable. How could we specifically target the kinds of users who would benefit the most from Taskcluster?

Out of the blue in August, a developer from a mobile game company contacted us to let us know that she had successfully deployed Taskcluster, and had a few suggestions for improvements, complete with patches.

Just like that, Taskcluster was in the wild.

Is Taskcluster right for you?

From talking with Ricky, the co-founder and principal programmer at Well Played Games who was the first to successfully deploy Taskcluster outside Mozilla, we learned a lot about the decision points that might lead someone to choose Taskcluster:

Taskcluster has given us more flexibility than any of the CI solutions we've used in the past. It is well engineered, letting us easily pick and choose the components we need, and quickly replace any that don't suit our use cases. Its native support for Kubernetes meshes perfectly with our tech stack.

Ricky Taylor, Co-Founder, Well Played Games

So, is Taskcluster right for you? The short answer is "maybe."

If your build and test pipeline is straightforward, there are simpler solutions out there for you. If you only support one platform, there is probably a more targeted solution for your use case.

However, if your CI needs are more complex, Taskcluster may be exactly what you need.

Here are some examples of use cases where Taskcluster might make sense for you:

You already have a person or team of people dedicated to your CI pipeline.

You currently support >1 CI system, probably for different platforms.

You have on-premise or custom hardware that you need to integrate into your CI pipeline.

Your current CI system is hitting a bottleneck or ceiling.

You are considering writing your own bespoke CI system to address any of the above concerns.

All of those are pretty good indications of CI complexity in our experience.

We've adopted "CI for Engineers" as the tagline for Taskcluster. Taskcluster will not solve your CI problems on its own, out-of-the-box, but a software engineer who understands your CI needs can make it do just about anything.

Better still, your software engineer doesn't have to go it alone. We're already building a community of Taskcluster users who can offer support to each other. Ricky from Well Played Games has already contributed new features that have been incorporated into Taskcluster and consulted on others.

If Taskcluster seems like a good fit for your CI needs, we encourage you to join other Taskcluster users and developers in Matrix or in Discourse.

Home page: https://taskcluster.net/

Documentation: https://docs.taskcluster.net/

Code: https://github.com/taskcluster/taskcluster/

Matrix: https://chat.mozilla.org/#/room/#taskcluster:mozilla.org

Discourse: https://discourse.mozilla.org/c/taskcluster/

If you'd like to investigate a live instance, the Mozilla Taskcluster deployment for community projects can be found here (no sign-in required): https://community-tc.services.mozilla.com/

Taskcluster - CI for Engineers

#Mozilla #Taskcluster #continuous integration #ci

Managing: team expectations

One of my biggest challenges when I began managing the Taskcluster team was simply getting my reports to talk to each other in a productive way. Per Conway's Law, the micro-service architecture of Taskcluster reflected the knowledge silos on the team. Communication was erratic at best. Fortunately, transitions offer an ideal opportunity to establish norms, revisit old ways of work, and perhaps even try something new.

Following the lead of a colleague, at the start of my tenure I sat my new team down for an entire day and hashed out the communications issues. What emerged at the end of that discussion was a document that recorded all the expectations we had for each other as teammates. The document was part aspiration and part contract, but was essential for establishing a baseline of trust that we could use to work together going forward.

Fast-forward to 2020, and Mozilla is going through yet another transition. As the make-up and scope of my team changes again, it is useful to revisit the expectations document to make sure everyone is still on the same page. After consulting with the team, I also decided to publish our Team Expectations doc on Github in the hopes that it might benefit others.

Taskcluster Team Expectations

This is partially self-serving: the Taskcluster team has many community contributors and the occasional intern, and we hope that by sharing our expectations more widely, we'll foster a better contribution environment around our project.

If you're interested in performing a similar exercise with your own team, be prepared to devote the time. I'd budget at least 4 hours to this process, depending on your team's current level of dysfunction. We had the fortune in the before times to be able to do this in-person, but a series of video calls would accomplish the same goal.

Content-wise, the headings we came up with (Accountability, Communications, Planning and The Design Process: RFCs, Implementation and Review, Triage, Dealing with outages) are a good jumping-off point for the discussion but may or may not make sense depending on your field or responsibilities. Having done the process with a few different teams now, I've found it's important not to over-structure this at the start. There is a lot of value here in digression because that's where you are most likely to find the areas where expectations are currently mismatched or unmet.

If you do try it out, please let me know how it went, especially if you end up evolving the process. Hopefully it meets your expectations. ;)

#management #Taskcluster #expectations #Mozilla

New to me: the Taskcluster team

All entities move and nothing remains still.

-- Heraclitus, as referenced by Plato

At this time last year, I had just moved on from Release Engineering to start managing the Sheriffs and the Developer Workflow teams. Shortly after the release of Firefox Quantum, I also inherited the Taskcluster team. The next few months were *ridiculously* busy as I tried to juggle the management responsibilities of three largely disparate groups.

By mid-January, it became clear that I could not, in fact, do it all. The Taskcluster group had the biggest ongoing need for management support, so that's where I chose to land. This sanity-preserving move also gave a colleague, Kim Moir, the chance to step into management of the Developer Workflow team.

Meet the Team

Let me start by introducing the Taskcluster team. We are:

Hassan Ali

Wander Lairson Costa

John Ford

Jonas Finnemann Jensen

Dustin Mitchell

Pete Moore

Eli Perelman

Brian Stack

We are an eclectic mix of curlers, snooker players, pinball enthusiasts, and much else besides. We also write and run continuous integration (CI) software at scale.

What are we doing?

The part I understand is excellent, and so too is, I dare say, the part I do not understand...

-- Socrates, in reference to Heraclitus

One of the reasons why I love the Taskcluster team so much is that they have a real penchant for documentation. That includes their design and post-mortem processes. Previously, I had only managed others who were using Taskcluster...consumers of their services. The Taskcluster documentation made it really easy for me to plug in quickly and help provide direction.

If you're curious about what Taskcluster is at a foundational level, you should start with the tutorial.

The Taskcluster team currently has three big efforts in progress.

1. Redeployability

Many Taskcluster team members initially joined the team with the dream of building a true, open source CI solution. Dustin has a great post explaining the impetus behind redeployability. Here's the intro:

Taskcluster has always been open source: all of our code is on Github, and we get lots of contributions to the various repositories. Some of our libraries and other packages have seen some use outside of a Taskcluster context, too.

But today, Taskcluster is not a project that could practically be used outside of its single incarnation at Mozilla. For example, we hard-code the name taskcluster.net in a number of places, and we include our config in the source-code repositories. There’s no legal or contractual reason someone else could not run their own Taskcluster, but it would be difficult and almost certainly break next time we made a change.

The Mozilla incarnation is open to use by any Mozilla project, although our focus is obviously Firefox and Firefox-related products like Fennec. This was a practical decision: our priority is to migrate Firefox to Taskcluster, and that is an enormous project. Maintaining an abstract ability to deploy additional instances while working on this project was just too much work for a small team.

The good news is, the focus is now shifting. The migration from Buildbot to Taskcluster is nearly complete, and the remaining pieces are related to hardware deployment, largely by other teams. We are returning to work on something we’ve wanted to do for a long time: support redeployability.

We're a little further down that path than when he first wrote about it in January, but you can read more about our efforts to make Taskcluster more widely deployable in Dustin's blog.

2. Support for packet.net

packet.net provides some interesting services, like baremetal servers and access to ARM hardware, that other cloud providers are only starting to offer. Experiments with our existing emulator tests on the baremetal servers have shown incredible speed-ups in some cases. The promise of ARM hardware is particularly appealing for future mobile testing efforts.

Over the next few months, we plan to add support for packet.net to the Mozilla instance of Taskcluster. This lines up well with the efforts around redeployability, i.e. we need to be able to support different and/or multiple cloud providers anyway.

3. Keeping the lights on (KTLO)

While not particularly glamorous, maintenance is a fact of life for software engineers supporting code that is running in production. That said, we should actively work to minimize the amount of maintenance work we need to do.

One of the first things I did when I took over the Taskcluster team full-time was halt *all* new and ongoing work to focus on stability for the entire month of February. This was precipitated by a series of prolonged outages in January. We didn't have an established error budget at the time, but if we had, we would have completely blown through it.

Our focus on stability had many payoffs, including more robust deployment stories for many of our services, and a new IRC channel (#taskcluster-bots) full of deployment notices and monitoring alerts. We needed to put in this stability work to buy ourselves the time to work on redeployability.
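For readers who haven't worked with error budgets, here is a minimal sketch of the arithmetic. The 99.9% availability target, the 30-day window, and the outage durations in the usage note are all hypothetical numbers chosen for illustration; they are not the Taskcluster team's actual targets or outage data.

```python
def error_budget_minutes(slo_target: float, window_days: int = 30) -> float:
    """Minutes of allowed downtime in the window for a given availability target."""
    if not 0.0 < slo_target <= 1.0:
        raise ValueError("slo_target must be in (0, 1]")
    total_minutes = window_days * 24 * 60
    return total_minutes * (1.0 - slo_target)


def budget_remaining(slo_target, outage_minutes, window_days=30):
    """Budget left after the listed outages; a negative result means the budget is blown."""
    return error_budget_minutes(slo_target, window_days) - sum(outage_minutes)
```

With an assumed 99.9% target over 30 days, the budget works out to roughly 43 minutes of downtime. Two made-up outages of 90 and 240 minutes, passed as `budget_remaining(0.999, [90, 240])`, would leave the result deep in negative territory, which is the kind of situation that justifies halting feature work to focus on stability.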

What are we *not* doing?

With all the current work on redeployability, it's tempting to look ahead to when we can incorporate some of these improvements into the current Firefox CI setup. While we do plan to redeploy Firefox CI at some point this year to take advantage of these systemic improvements, it is not our focus...yet.

One of the other things I love about the Taskcluster team is that they are really good at supporting community contribution. If you're interested in learning more about Taskcluster or even getting your feet wet with some bugs, please drop by the #taskcluster channel on IRC and say Hi!

#Mozilla #Taskcluster #management

chrissiezullo

Colored in my Black Panther sketch from the other day, hope you guys like it 😎

(Seven Lions)

#SoundCloud #music #Seven Lions

Experiments in productivity: the shared bug queue

Maybe you have this problem too

You manage or are part of a team that is responsible for a certain functional area of code. Everyone on the team is at different points in their career. Some people have only been there a few years, or maybe even only a few months, but they're hungry and eager to learn. Other team members have been around forever, and due to that longevity, they are go-to resources for the rest of your organization when someone needs help in that functional area. More-senior people get buried under a mountain of review requests, while those less-senior engineers who are eager to help and grow their reputation get table scraps.

This is the situation I walked into with the Developer Workflow team.

This was the first time that Mozilla had organized a majority (4) of build module peers in one group. There are still isolated build peers in other groups, but we'll get to that in a bit.

With apologies to Ted, he's the elder statesman of the group, having once been the build module owner himself before handing that responsibility off to Greg (gps), the current module owner. Ted has been around Mozilla for so long that he is a go-to resource for not only build system work but many other projects he's been involved with, e.g. crash analysis. In his position as module owner, Greg bears the brunt of the current review workload for the build system. He needs to weigh in on architectural decisions, but also receives a substantial number of drive-by requests simply because he is the module owner.

Chris Manchester and Mike Shal by contrast are relatively new build peers and would frequently end up reviewing patches for each other, but not a lot else. How could we more equitably share the review load between the team without creating more work for those engineers who were already oversubscribed?

Enter the shared bug queue

When I first came up with this idea, I thought that certainly this must have been tried at some point in the history of Mozilla. I was hoping to plug into an existing model in bugzilla, but alas, such a thing did not already exist. It took a few months of back-and-forth with our resident Bugmaster at Mozilla, Emma, to get something set up, but by early October, we had a shared queue in place.

How does it work?

We created a fictitious meta-user, [email protected]. Now whenever someone submits a patch to the Core::Build Config module in bugzilla, the suggested reviewer always defaults to that shared user. Everyone on the team watches that user and pulls reviews from "their" queue.

That's it. No, really.

Well, okay, there's a little bit more process around it than that. One of the dangers of a shared queue is that since no specific person is being nagged for pending reviews, the queue could become a place where patches go to die. As with any defect tracking system, regular triage is critically important.
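The mechanics described above can be modelled in a few lines. This is only a toy sketch of the idea, not how bugzilla implements it: the `SharedQueue` class, the `build-config-reviewers` name, and the seven-day staleness cutoff are all made up for illustration.

```python
from dataclasses import dataclass, field
from datetime import date, timedelta

SHARED_REVIEWER = "build-config-reviewers"  # hypothetical stand-in for the shared meta-user


@dataclass
class ReviewRequest:
    bug_id: int
    requested_on: date
    reviewer: str = SHARED_REVIEWER  # new patches default to the shared user


@dataclass
class SharedQueue:
    requests: list = field(default_factory=list)

    def submit(self, bug_id, requested_on, reviewer=None):
        """File a review; it lands on the shared user unless a specific person is named."""
        request = ReviewRequest(bug_id, requested_on, reviewer or SHARED_REVIEWER)
        self.requests.append(request)
        return request

    def pending(self):
        """Reviews still sitting on the shared user, oldest first."""
        waiting = [r for r in self.requests if r.reviewer == SHARED_REVIEWER]
        return sorted(waiting, key=lambda r: r.requested_on)

    def pull(self, peer):
        """A peer claims the oldest unclaimed review, or gets None if the queue is empty."""
        waiting = self.pending()
        if not waiting:
            return None
        waiting[0].reviewer = peer
        return waiting[0]

    def stale(self, today, max_age_days=7):
        """Unclaimed reviews older than the cutoff: the list to walk through at triage."""
        cutoff = today - timedelta(days=max_age_days)
        return [r for r in self.pending() if r.requested_on < cutoff]
```

Peers call `pull()` whenever they have review bandwidth, and `stale()` produces the backlog for the weekly triage. Requests submitted with an explicit `reviewer` never enter the shared pool, which mirrors the ability to escalate a review to a specific person.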

Is it working?

In short: yes, very much so.

Subjectively, it feels great. We've solved some tricky people problems with a pretty straightforward technical/process solution and that's amazing. From talking to all the build peers, they feel a new collective sense of ownership of the build module and the code passing through it. The more-senior people feel they have more time to concentrate on higher level issues or deeper reviews. The less-senior people are building their reputations, both among the build peers and outside the group to review requesters.

Numerically speaking, the absolute number of review requests for the Core::Build Config module is consistent since the adoption of the shared queue. The distribution of actual reviewers has changed a lot though. Greg and Ted still end up reviewing their share of escalated requests — it's still possible to assign reviews to specific people in this system — but Mike Shal and Chris have increased their review volume substantially. What's even more awesome is that the build peers who are *NOT* in the Developer Workflow team are also fully onboard, regularly pulling reviews off the shared queue. Kudos to Nick Alexander, Nathan Froyd, Ralph Giles, and Mike Hommey for also embracing this new system wholeheartedly.

The need for regular triage has also provided another area of growth for the less-senior build peers. Mike Shal and Chris Manchester have done a great job of keeping that queue empty and forcing the team to triage any backlog each week in our team meeting.

Teh Future

When we were about to set this up in October, I almost pulled the plug.

Over the next six months, Mozilla is planning to switch code review tools from mozreview/splinter to phabricator. Phabricator has more modern built-in tools like Herald that would have made setting up this shared queue a little easier, and that's why I paused...briefly.

Phabricator will undoubtedly enable a host of quality-of-life improvements for developers when it is deployed, but I'm glad we didn't wait for the new system. Mozilla engineers are already getting accustomed to the new workflow and we're reaping the benefits *right now*.

#Mozilla #bugzilla #Developer Workflow #management

Welcome, Connor!

This is *not* our Connor.

This post is *ahem* several months overdue, but I'm happy to welcome Connor Sheehan to the team.

Connor was a two-time intern with the Mozilla release engineering team. In that capacity, he became well acquainted with some of the bottlenecks in our CI system. We've brought him onboard to assist gps with stabilizing and scaling our mercurial infrastructure.

Welcome, Connor!

#Mozilla #hg #vcs #newhire #hiring

Work Week Logistics, Revisited

I've written before about how to be productive when distributed teams get together and was anxious to try it out on my "new" (read: six-month-old) team, Developer Workflow. As mentioned in that previous post, we just had a work week in Mountain View, so here's a quick recap.

Process Improvements

We often optimize work week location around where the fewest people would need to travel to attend. While this does make things logistically easier, it also introduces imbalance. Some people will have traveled very far, while some people will be able to sleep in their own beds. Conversely, the local people may feel they need to go home every night in order to be with their partners/families/cats and may miss out on the informal bonding that can happen at group dinners and such.

We had originally intended to meet in San Francisco, but other conferences had jacked up hotel rates, so we decided to decamp to the Valley. I offered to have the SF residents book rooms to avoid the daily commute up and down the peninsula. They didn't all take me up on it, but it was an opportunity to put everyone on more equal footing.

Schedule-wise, I set things up so that we had our discussion and planning pieces in the morning each day while we were still fresh and caffeinated. After lunch, we would get down to hacking on code. Ted threw together a tracking tool to help visualize the Makefile burndown. Ted is also great at facilitating meetings, keeping us on track especially later in the week as we all started to fade.

Accomplishments

So what did we actually get done? Like the old adage about a station wagon full of tapes, never underestimate the review bandwidth of 4 build peers hacking in a room together for an afternoon. We accomplished quite a bit during our time together.

Aside from the 2018 planning detailed in the previous post, we also met with mobile build peer Nick Alexander and planned how to handle mobile Makefiles. The mobile version of Firefox now builds with gradle, so it was important not to step on each other's toes. Another huge proportion of the remaining Makefiles involve l10n. We figured out how to work around l10n for now, i.e. don't break repacks, to get a tup build working, and we've set up a meeting with the l10n team for Austin to discuss their plans for langpacks and a future that might not involve Makefiles at all. The l10n stuff is hairy, and might be partially my fault (see previous comment re: cargo-culting), so thanks to my team for not shying away from it.

On a concrete level, Ted reports that we've removed 13 Makefiles and ~100 lines of other Makefile content in the past month, much of which happened over the past few weeks. Greg has also managed to remove big pieces of complexity from client.mk, assisted by reviews from Chris, Mike, Nick and other build peers. We're getting into the trickier bits now, but we're persevering.

All in all, a very successful work week with my "new" team. I continue to find subtle ways to make these get-togethers more effective.

#Mozilla #distributed #teamweek #workweek #logistics

Introducing The Developer Workflow Team

I've neglected to write about the *other* half of my team, not for any lack of desire to do so, but simply because the code sheriffing situation was taking up so much of my time. Now that the SoftVision contractors have gained the commit access required to be fully functional sheriffs, I feel that I can shift focus a bit.

Meet the team

The other half of my team consists of 4 Firefox build system peers. They are:

Chris Manchester

Ted Mielczarek

Mike Shal

Greg Szorc

When the group was first established, we talked a lot about what we wanted to work on, what we needed to work on, and what we should be working on. Those discussions revealed the following common themes:

We have a focus on developers. Everything we work on is to help developers be more productive, and go more quickly.

We accomplish this through tooling to support better/faster workflows.

Some of these improvements can also assist in automation, but that isn't our primary focus, except where those improvements are also wins for developers, e.g. faster time to first feedback on commit.

We act as consultants/liaisons to many other groups that also touch the build system, e.g. Servo, WebRTC, NSS etc.

Based on that list of themes, we've adopted the moniker of "Developer Workflow." We are all build peers, yes, but to pigeon-hole ourselves as the build system group seemed short-sighted. Our unique position at the intersection of the build system, VCS, and other services meant that our scope needed to match what people expect of us anyway.

While new to me, Developer Workflow is a logical continuation of the build system tiger team organized by David Burns in 2016. This is the same effort that yielded sea change improvements such as artifact builds and sccache.

In many ways, I feel extremely fortunate to be following on the heels of that work. During the previous year, all the members of my team formed the working relationships they would need to be more successful going forward. All the hard work for me as their manager was already done! ;)

What are we doing

We had our first dedicated work week as a team last week in Mountain View. Aside from getting to know each other a little better, during the week we hashed out exactly what our team will be focused on next year, and made substantial progress towards bootstrapping those efforts.

Next year, we'll be tackling the following projects:

Finish the migration from Makefiles to moz.build files: A lot of important business logic resides in Makefiles for no good reason. As someone who has cargo-culted large portions of l10n Makefile logic during my tenure at Mozilla, I may be part of the problem.

Move build logic out of *.mk files: Greg recently announced his intent to remove client.mk, a foundational piece of code in the Mozilla recursive make build system that has existed since 1998. The other .mk files won't be far behind. Porting true build logic to moz.build files and removing non-build tasks to task-based scripts will make the build system infinitely more hackable, and will allow us to pursue performance gains in many different areas. For example, decoupled tests like package tests could be run asynchronously, getting results to developers more quickly.

Stand up a tup build in automation: this is our big effort for the near-term. A tup build is not necessarily an end goal in and of itself (we may very well end up on bazel or something else eventually), but since Mike Shal created tup, we control enough of the stack to make quick progress. It's a means of validating the Makefile migration.

Move our Linux builder in automation from CentOS 6 to Debian: This would move us closer to deterministic builds, and has alignment with the Tor Project, but requires we host our own package servers, CDN, etc. This would also make it easier for developers to reproduce automation builds locally. glandium has a proof-of-concept. We hope to dig into any binary compatibility issues next year.

Weaning off mozharness for builds: mozharness was a good first step at putting automated build configuration information in the tree for developers. Now that functionality could be better encapsulated elsewhere, and largely hidden by mach. The ultimate goal would be to use the same workflow for developer builds and automation.

What are we *not* doing

It's important to be explicit about things we won't be tackling too, especially when it's been unclear historically or where there might be different expectations.

The biggest one to call out here is github integration. Many teams at Mozilla are using github for developing standalone projects or even parts of Firefox. While we've had some historical involvement here and will continue to consult as necessary, other teams are better positioned to drive this work.

We are also not currently exploring moving Windows builds to WSL. This is something we experimented with in Q3 this year, but build performance is still so slow that it doesn't warrant further action right now. We continue to follow the development of WSL and if Microsoft is able to fix filesystem performance, we may pick this back up.

#Mozilla #developer tools #build system #tup #workflow

(Mr. Ours)

#SoundCloud #music #Mr. Ours #Hip Hop #rap #triphop #electro

(MisterWives)

#SoundCloud #music #MisterWives

(Spencer Brown)

#SoundCloud #music #Spencer Brown #spencer brown #spencerbrown #anjunabeats #anjunadeep

Code sheriffing @ Mozilla: Past, Present, and Future

In a github world, developers have certain baseline expectations about interacting with source code and the tooling around it. These expectations can color their choices about which projects to contribute to. If Mozilla wants to compete with other companies and open source projects for developer mindshare (and code), we need to evolve the way we develop and distribute software. Code sheriffing and its associated tooling is one piece of that puzzle.

I inherited the Mozilla code sheriff team back in April. I didn’t initially think anything needed to change with sheriffing at Mozilla. Things had been “fine” for a while, so why rock the boat?

As is my nature, I dug into the history of my new team when I inherited it. What follows is a brief retrospective of sheriffing at Mozilla, the changes we’re undergoing right now, and my vision for how it might change in the future.

Past

I’ve been at Mozilla long enough now to remember when developers themselves acted as code sheriffs. In the beginning, every developer at Mozilla (myself included) rotated through the position[1]. Some developers were quite conscientious about sheriffing, others never even realized it was their turn. There was no formal training. Not surprisingly, the results were…uneven.

As the number of developers and the volume of code increased, this model became untenable. Code sheriffing as a well-defined role didn’t exist at Mozilla until 2012, initially coming as a response to the staffing increase in the lead-up to Firefox 4. At the same time, Mozilla was moving away from a “strict” waterfall development model tied to Tinderbox. Our new buildbot-based approach to CI allowed us to land more code, more quickly. Dedicated sheriffs were needed to make sense of it all. Even then, in true Mozilla fashion, sheriffing was an activity that blurred the lines between community and staff. Some of the most dedicated code sheriffs we have ever had were/are volunteers.

Whether staff or community, code sheriffs became de facto stewards of code quality. They were responsible for daily merges, selecting changesets with the lowest number of intermittent failures that would be suitable for inclusion in Nightly releases. When things broke, the sheriffs were responsible for backing out code, and even closing the development trees if the situation became sufficiently dire.
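
As a rough illustration of the merge-selection job described above, here is a minimal Python sketch. It is not how the sheriffs' actual tooling worked: the `Push` record, its fields, and `pick_merge_candidate` are all hypothetical names, and it assumes each push already carries failure counts that reflect the state of the tree at that push.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Push:
    """One push on an integration branch (all fields are illustrative)."""
    revision: str
    permanent_failures: int     # failures classified as real breakage
    intermittent_failures: int  # failures classified as intermittent


def pick_merge_candidate(pushes):
    """Return the push to merge, or None if no push is suitable.

    `pushes` is ordered oldest to newest. Pushes with permanent failures
    are skipped. Among the rest, the one with the fewest intermittent
    failures wins, with ties going to the newest push so that the merge
    carries as much code as possible.
    """
    best = None
    for push in pushes:
        if push.permanent_failures > 0:
            continue
        if best is None or push.intermittent_failures <= best.intermittent_failures:
            best = push
    return best
```

In practice a human sheriff weighs far more context than two counters, which is part of why the role was hard to staff.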

With the opening of the Mozilla office in Taipei, and the associated re-tasking of two QA resources as code sheriffs in that office, Mozilla almost had around-the-clock (24/7) coverage for code sheriffing, provided no one ever got sick or took a vacation.

We persevered in this model for a few years, and our developers understandably became accustomed to the freedom it provided them. Developers could functionally land their code and not worry about the outcome: the code sheriffs would ping them if any follow-up action was required. Fire-and-forget, if you will.

Sadly, in June 2017 our last Taipei sheriff resigned, leaving us with a glaring hole in our coverage. Even with community assistance, there were 8-10 hours per day with *no* active sheriffing. This led to an increase in tree closing events as sheriffs often needed to determine the root cause for a failure that had many commits on top of it already. Complaints started coming in about delays in landing code, and also about classification errors, e.g. permanent failures wrongly triaged as intermittent due to the time pressures of working in this mode. People were not happy, least of all the sheriffs.

This is when I realized I needed to rethink how sheriffing at Mozilla should work.

Present

The knee-jerk reaction would have been to simply hire another sheriff in Taipei, but that still would have left us vulnerable to illness, vacation, and further employment changes. Luckily, another solution presented itself.

Mozilla has an established history of working with SoftVision. I enlisted their help myself a few years ago when I was working in releng to help address our buildduty problem. It came to my attention that SoftVision was creating a 24/7 support service, and I decided to give it a try. That’s where we are now.

The SoftVision sheriffing contractors started in late August. They have spent the last two months learning (and then practicing) how to classify automation failures. The harder piece is learning how to properly select mergeable changesets and perform backouts. Mozilla guards the kind of source control access required to perform these code sheriffing activities pretty closely; it’s not something we simply give away. The contractors are slowly building that trust the same as any other contributor would. We’re getting there though:

An important milestone today: a SoftVision sheriff backed out their first commit: https://t.co/tcBhp7N7qy #Mozilla

— Chris Cooper (@ccooper) October 19, 2017

Once the SoftVision sheriffs are fully up-to-speed, they will be available 24/7 to assist developers, and to further the Mozilla mission with the usual array of merges, backouts, uplifts, and tree closures.

Right now, we are relying on the magnanimity of the former sheriffs and community sheriffs to help bridge the gap while the contractors are training up. It’s true, sheriffing throughput is still not back to where it was before we lost our sheriff in Taipei, but I can see the light at the end of the tunnel.

Future

How can I be sure that light isn’t a train? Well, that’s the trick, isn’t it?

In retrospect, it was naïve of me to think that sheriffing could have existed for any length of time the way it was. Sheriffs felt enormous pressure to work longer hours than they should have because the trees needed to stay open, and "if not them, then who?" The human toll on those performing the work, whether staff or volunteer, was simply too high.

Yes, for the near-future at least, the SoftVision contractors will continue to perform merges and backouts as required in the model to which we’ve become accustomed. That work is still very operational, hands-on, and prone to burnout, and that’s where I think the biggest opportunity for change will come going forward.

Mozilla currently has two integration branches – mozilla-inbound and autoland – in addition to mozilla-central. This makes life much harder for sheriffs because they need to merge code three ways between the different branches. When bad code gets merged around accidentally, we are almost forced to close the trees while we recover.

The obvious change is to simplify the process and remove one of the integration branches. This might actually be feasible in the near future. With the announcement of Mozilla’s adoption of phabricator, 99.9% of code should eventually be able to land directly in the autoland repo, allowing us to decommission the mozilla-inbound repo. Once we return to a single integration branch, developer workflows can be much more streamlined, and streamlined workflows are ideal targets for automation.

My ideal future developer workflow would be:

1. Developer writes patch.
2. Developer compiles patch locally.
3. Patch posted to phabricator, triggers try run automatically.
4. If try run passes, suitable patch reviewers are selected automatically.
5. After successful review, patch is landed automatically on the autoland branch.
6. Autoland gets merged to mozilla-central automatically for changesets below the noise threshold for failures.

There are no code sheriffs in that picture at all. That’s a good thing.[2]

There’s a gulf of tooling improvements between where we are and that potential future, but if Mozilla wants to keep increasing the pace of development and attracting the best developers, I think the tooling investment is one we need to make.
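
To make the ideal workflow listed above a little more concrete, here is a minimal Python sketch of it as a series of automated gates. Everything in it is an assumption made for illustration: the `Patch` fields, the `pick_reviewers` callback, the status strings, and especially the `NOISE_THRESHOLD` value, since no number is given for the noise threshold. It does not reflect any real phabricator or autoland API.

```python
from dataclasses import dataclass

# Assumed value for illustration only; the real threshold is not specified.
NOISE_THRESHOLD = 0.02


@dataclass
class Patch:
    """A patch moving through the hypothetical pipeline."""
    author: str
    builds_locally: bool     # step 2: did the local compile succeed?
    try_passes: bool         # step 3: result of the automatic try run
    review_approved: bool    # step 5: outcome of code review
    failure_rate: float      # fraction of failing jobs once on autoland


def run_pipeline(patch, pick_reviewers, noise_threshold=NOISE_THRESHOLD):
    """Walk a patch through each gate and return where it ended up."""
    if not patch.builds_locally:
        return "needs-work: local build failed"
    # Posting the patch for review triggers the try run automatically.
    if not patch.try_passes:
        return "needs-work: try run failed"
    # Reviewers are only selected once the try run is green.
    reviewers = pick_reviewers(patch)
    if not reviewers:
        return "blocked: no reviewer found"
    if not patch.review_approved:
        return "needs-work: review changes requested"
    # The patch is now on autoland; it only moves on to mozilla-central
    # if its failures stay below the noise threshold.
    if patch.failure_rate >= noise_threshold:
        return "landed-on-autoland: held back from mozilla-central"
    return "merged-to-mozilla-central"
```

A call such as `run_pipeline(Patch("dev", True, True, True, 0.0), lambda p: ["reviewer"])` walks through every gate without a human in the loop, which is the point of the exercise. The only outcome here that would still plausibly need a person is a patch held back on autoland.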

1. Hilariously, a version of that sheriffing calendar still exists, projecting sheriff duty off into the future for a bunch of developers who haven’t even been at Mozilla for years.

2. I’m not naïve enough to think we won’t need *any* sheriffs. Even Facebook’s model still needs some.

#Mozilla #CI #code #sheriff

Shameless self (release) promotion: Firefox 53.0b1 from TaskCluster

You may recall two short months ago when we moved Linux and Android nightlies from buildbot to TaskCluster. Due to the train model, this put us (release engineering) on a clock: either we'd be ready to release a beta version of Firefox 53 for Linux and Android using release promotion in TaskCluster, or we'd need to hold back our work for at least the next cycle, causing uplift headaches galore.

I'm happy to report that we were able to successfully release Firefox 53.0b1 for Linux and Android from TaskCluster last week. This is impressive for 3 reasons:

Mac and Windows builds were still promoted from buildbot, so we were able to seamlessly integrate the artifacts of two different continuous integration (CI) platforms.

The process whereby nightly builds are generated has always been different from how we generate release builds. Firefox 53.0b1 represents the first time a beta build was generated using the same taskgraph we use for a nightly, thereby reducing the delta between CI builds and release builds. More work to be done here, for sure.

Nobody noticed. With all the changes under the hood, this may be the most impressive achievement of all.

A round of thanks to Aki, Johan, Kim, and Mihai who worked hard to get the pieces in place for Android, and a special shout-out to Rail who handled the Linux beta while also dealing with the uplift requirements for ESR52. Of course, thanks to everyone else who has helped with the migration thus far. All of that foundational work is starting to pay off.

Much more to do, but I look forward to updating you about Mac and Windows progress soon.

#Mozilla #releng #beta #Firefox

RelEng & RelOps highlights - February 21, 2017

It's been a while. How are you?

Modernize infrastructure:

We finally closed the 9-year-old bug requesting that we redirect all HTTP traffic to hg.mozilla.org to HTTPS! Many thanks to everyone who helped ensure that automation and other tools continued to work normally. Not every day you get to close bugs that are older than my kids. https://bugzilla.mozilla.org/show_bug.cgi?id=450645

The new TreeStatus page (https://mozilla-releng.net/treestatus) was finally released by garbas, with a proxy in place of the old URL.

Improve Release Pipeline:

Initial work on the Uplift dashboard has been done by bastien and released to production by garbas. https://shipit.mozilla-releng.net/release-dashboard

Releng had a workweek in Toronto to plan how release promotion will work in a TaskCluster world. With the uplift for Firefox 53 rapidly approaching (see Release below), we came up with a multi-phase plan that should allow us to release the Linux and Android versions of Firefox 53 from TaskCluster, with the Mac and Windows versions still being created by buildbot.

Improve CI Pipeline:

Alin and Sebastian disabled Windows 10 tests on our CI. Windows 10 tests will be reappearing later this year once we move datacentres and acquire new hardware to support them. https://bugzilla.mozilla.org/show_bug.cgi?id=1330999

Andrei and Relops converted some Windows talos machines to run Linux64 to reduce wait times on this platform. https://bugzilla.mozilla.org/show_bug.cgi?id=1337452

There are some upcoming deadlines involving datacentre moves that, while not currently looming, are definitely focusing our efforts on the TaskCluster migration. As part of the aforementioned workweek, we targeted the next platform that needs to migrate, Mac OS X. We are currently breaking out the packaging and signing steps for Mac so that they can be done on Linux. That work can then be re-used for l10n repacks *and* release promotion.

Operational:

Since most of our Linux64 builds and tests have migrated to TaskCluster, Alin was able to shut down many of our Linux buildbot masters. This will reduce our monthly AWS bill and the complexity of our operational environment. https://bugzilla.mozilla.org/show_bug.cgi?id=1335435

Hal ran our first “hard close” Tree Closing Window (TCW) in quite a while on Saturday, February 11 (https://bugzilla.mozilla.org/show_bug.cgi?id=1324148). It ran about an hour longer than planned due to some strange interactions deep in the back end, which is why it was a "hard close." The issue may be related to occasional "database glitches" we have seen in the past. This time IT got some data, and have raised a case with our load balancer vendor.

Release:

We are deep in the beta cycle for Firefox 52, with beta 8 coming out this week. Firefox 52 is an important milestone release because it signals the start of another ESR cycle.

See you again soon!

#Mozilla #releng #highlights

Being productive when distributed teams get together, take 2

Every year, hundreds of release engineers swim upstream because they're built that way.

Last week, we (Mozilla release engineering) had a workweek in Toronto to jumpstart progress on the TaskCluster (TC) migration. After the success of our previous workweek for release promotion, we were anxious to try the same format once again and see if we could realize any improvements.

Prior preparation prevents panic

We followed all of the recommendations in the Logistics section of Jordan's post to great success.

Keeping developers fed & watered is an integral part of any workweek. If you ever want to burn a lot of karma, try building consensus between 10+ hungry software developers about where to eat tonight, and then finding a venue that will accommodate you all. Never again; plan that shit in advance. Another upshot of advance planning is that you can also often go to nicer places that cost the same or less. Someone on your team is a (closet) foodie, or is at least a local. If it's not you, ask that person to help you with the planning.

What stage are you at?

The workweek in Vancouver benefitted from two things:

A week of planning at the All-Hands in Orlando the month before; and,

Rail flying out to Vancouver a week early to organize much of the work to be done.

For this workweek, it turned out we were still at the planning stage, but that's totally fine! Never underestimate the power of getting people on the same page. Yes, we did do *some* hacking during the week. Frankly, I think it's easier to do the hacking bit remotely, but nothing beats a bunch of engineers in a room in front of a whiteboard for planning purposes. As a very distributed team, we rarely have that luxury.

Go with it

...which brings me to my final observation. Because we are a very distributed team, opportunities to collaborate in person are infrequent at best. When you do manage to get a bunch of people together in the same room, you really do need to go with discussions and digressions as they develop.

This is not to say that you shouldn't facilitate those discussions, timeboxing them as necessary. If I have one nit to pick with Jordan's post it's that the "Operations" role would be better described as a facilitator. As a people manager for many years now, this is second-nature to me, but having someone who understands the problem space enough to know "when to say when" and keep people on track is key to getting the most out of your time together.

By and large, everything worked out well in Toronto. It feels like we have a really solid format for workweeks going forward.

#Mozilla #releng #workweek
