Tuesday, August 02, 2016

Usability improvements for Firefox automation initiative - Status update #2

In this update, we will look at the progress made since our initial update.

A reminder that this quarter’s main focus is on:
  • Debugging tests on interactive workers (only Linux on TaskCluster)
  • Improve end to end times on Try (Thunder Try project)

For all bugs and priorities you can check out the project management page for it:

Debugging tests on interactive workers

Accomplished recently:
  • Bug 1285582 - Fixed Xvfb startup issue
  • Bug 1288827 - Improved mochitest UX (no longer need --appname, paths normalized)
  • Bug 1289879 - Uses mozharness venv if available

Upcoming:
  • Support for smaller test harnesses (Cpp, Mn, wpt, etc)
  • Improved one-click-loaner UX


Thunder Try - Improve end to end times on try

Project #1 - Artifact builds on automation

No news for this edition and probably the next one.


Project #2 - S3 Cloud Compiler Cache

Accomplished recently:
  • Working on testing sccache re-write on Try
  • More news in the next update


Project #3 - Metrics

Accomplished recently:
  • Bug 1242017 - Metrics team will configure ingestion point into Telemetry

Upcoming:
  • Bug 1258861 - Working on underlying data model at the moment:


Other
  • Bug 1287604 - Experiment with different AWS instance types for TC linux64 builds
    • Some initial experiments have shown we can shave 20 minutes off an average linux64 build by using more powerful AWS instances, with a reasonable cost tradeoff. We’ll start the work of migrating to these new instances soon.
  • Bug 1272083 - Downloading and unzipping should be performed as data is received
  • Bug 1286336 - Improve interaction of automation with version control
    • Buildbot AMIs now seeded with mozilla-unified repo (Bug 1232442)
    • TaskCluster decision and various lint/test tasks now use `hg robustcheckout` and share caches more optimally (Bug 1247168)
      • Flake8 tasks now complete in as little as 9s (~3m before)
      • Decision tasks now complete in <60s on average
    • Some TaskCluster tasks now share VCS checkouts on Try (Bug 1289643)
      • Tasks will complete faster on Try due to not having to perform full VCS checkout
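
To illustrate the idea behind Bug 1272083 (unpacking while the bytes are still arriving), here is a minimal sketch. It is not the code from the bug. It swaps in a tar archive for the zip case, because a zip's central directory sits at the end of the file and Python's zipfile module needs a seekable file, while tarfile can read a plain forward-only stream. The function name is hypothetical.

```python
import os
import shutil
import tarfile


def extract_while_downloading(stream, dest_dir):
    """Unpack a tar archive from a forward-only stream as its bytes arrive."""
    extracted = []
    dest_root = os.path.realpath(dest_dir)
    # Mode "r|*" reads the archive strictly sequentially, so the stream does
    # not need to support seeking and the archive is never stored whole.
    with tarfile.open(fileobj=stream, mode="r|*") as archive:
        for member in archive:
            if not member.isfile():
                continue  # this sketch only handles regular files
            target = os.path.realpath(os.path.join(dest_root, member.name))
            # Refuse entries that would be written outside dest_dir.
            if os.path.commonpath([dest_root, target]) != dest_root:
                raise ValueError("unsafe path in archive: " + member.name)
            os.makedirs(os.path.dirname(target), exist_ok=True)
            with archive.extractfile(member) as src, open(target, "wb") as dst:
                shutil.copyfileobj(src, dst)
            extracted.append(member.name)
    return extracted
```

Any object with a read() method can be passed as the stream, for example the response returned by urllib.request.urlopen(), so files start landing on disk before the download has finished.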

[3] https://bugzilla.mozilla.org/show_bug.cgi?id=1247168


Creative Commons License
This work by Zambrano Gasparnian, Armen is licensed under a Creative Commons Attribution-Noncommercial-Share Alike 3.0 Unported License.

Thursday, July 21, 2016

Mozci and pulse actions contributions opportunities

We've recently finished a season of feature development on pulse_actions, adding TaskCluster support for adding new jobs to Treeherder.

I'm now looking at what optimizations or features are left to complete. If you would like to contribute, feel free to let me know.

Here's some highlighted work (based on pulse_actions issues and bugs):
This will help us save money in Heroku since using Buildapi + buildjson files is memory hungry and requires us to use bigger Heroku nodes.
This is important to help us change the behaviour of the Heroku app without having to commit any code. I've used this in the past to modify the logging level when debugging an issue.
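
As a rough illustration of changing behaviour without a commit, the sketch below reads the logging level from an environment variable, which on Heroku could be set as a config var. This is only an assumption about how it might be wired up: LOG_LEVEL and configure_logging are hypothetical names, not taken from pulse_actions.

```python
import logging
import os


def configure_logging(environ=os.environ):
    """Set the root logger level from an environment variable."""
    # LOG_LEVEL is a hypothetical variable name chosen for this sketch.
    name = environ.get("LOG_LEVEL", "INFO").upper()
    level = getattr(logging, name, None)
    if not isinstance(level, int):
        # Unknown or malformed values fall back to INFO.
        level = logging.INFO
    logging.getLogger().setLevel(level)
    return level
```

With this in place, switching the variable to DEBUG and restarting the app would be enough to get verbose logs while chasing an issue.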

This is also useful if we want to have different pipelines in Heroku. 
Having Heroku pipelines helps us to test different versions of the software.
This is useful if we want to have a version running from 'master' against the staging version of Treeherder.
It would also help contributors to have a version of their pull requests running live.
We don't have any tests running. We need to determine how to run a minimum set of tests to have some confidence in the product.

This needs integration tests of Pulse messages.
The comment in the bug is rather accurate and it shows that there are many small things that need fixing.
Manual backfilling uses Buildapi to schedule jobs. If we switched to scheduling via TaskCluster/Buildbot-bridge we would get better results since we can guarantee proper scheduling of a build + associated dependent jobs. Buildapi does not give us this guarantee. This is mainly useful when backfilling PGO test and talos jobs.

If instead you're interested in contributing to mozci, you can have a look at the issues.



Tuesday, July 19, 2016

Usability improvements for Firefox automation initiative - Status update #1

The developer survey conducted by Engineering Productivity last fall indicated that debugging test failures that are reported by automation is a significant frustration for many developers. In fact, it was the biggest deficit identified by the survey. As a result, the Engineering Productivity Team (aka A-Team) is working on improving the user experience for debugging test failures in our continuous integration and speeding up the turnaround for Try server jobs.

This quarter’s main focus is on:
  • Debugging tests on interactive workers (only Linux on TaskCluster)
  • Improve end to end times on Try (Thunder Try project)

For all bugs and priorities you can check out the project management page for it:

In this email you will find the progress we’ve made recently. In future updates you will see a delta from this email.

PS: These status updates will be fortnightly.


Debugging tests on interactive workers
Accomplished recently:
  • Landed support for running reftest and xpcshell via tests.zip
  • Many UX improvements to the interactive loaner workflow

Upcoming:
  • Make sure Xvfb is running so you can actually run the tests!
  • Mochitest support + all other harnesses


Thunder Try - Improve end to end times on try

Project #1 - Artifact builds on automation

Accomplished recently:
  • Landed prerequisites for Windows and OS X artifact builds on try.
  • Identified which tests should be skipped with artifact builds

Upcoming:
  • Provide a try syntax flag to trigger only artifact builds instead of full builds; starting with opt Linux 64.


Project #2 - S3 Cloud Compiler Cache

Accomplished recently:
  • Sccache’s Rust re-write has reached feature parity with Python’s sccache
  • Now testing sccache2 on Try

Upcoming:
  • We want to roll out a two-tier sccache for Try, which will enable it to benefit from cache objects from integration branches
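
To make the two-tier idea concrete, here is a small sketch of a read-through lookup: check the Try tier first, then fall back to objects produced on integration branches. It is not how sccache implements it. Plain dicts stand in for the S3 buckets, the class name is hypothetical, and sending writes only to the Try tier is an assumption made for this example.

```python
class TwoTierCache:
    """Look up objects in a primary tier, then in a read-only fallback tier."""

    def __init__(self, primary, fallback):
        self.primary = primary    # stand-in for the Try cache
        self.fallback = fallback  # stand-in for the integration branch cache

    def get(self, key):
        # A hit in either tier avoids recompiling the object.
        if key in self.primary:
            return self.primary[key]
        if key in self.fallback:
            return self.fallback[key]
        return None

    def put(self, key, value):
        # Assumption for this sketch: Try pushes never write to the
        # integration tier, so they cannot pollute it.
        self.primary[key] = value
```

The benefit for Try is that a push based on a recent integration commit can reuse most compiled objects even when the Try tier itself is cold.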


Project #3 - Metrics

Accomplished recently:

Upcoming:
  • Putting Mozharness steps’ data inside Treeherder’s database for aggregate analysis

Other
Upcoming:
  • TaskCluster Linux builds are currently built using a mix of m3/r3/c3 2xlarge AWS instances, depending on pricing and availability. We're going to assess the effect on build speeds of using more powerful AWS instance types, as one potential way of reducing e2e Try times.

