Friday, April 22, 2016

The Joy of Automation

This post is to announce The Joy of Automation YouTube channel. In this channel you should be able to watch presentations about automation work by Mozilla's Platforms Operations. I hope more folks than me would like to share their videos in here.

This follows the idea that mconley started with The Joy of Coding and his livehacks.
At the moment there is only "Unscripted" videos of me hacking away. I hope one day to do live hacks but for now they're offline videos.

Mistakes I made in case any Platform Ops member wanting to contribute want to avoid:

  • Lower the music of the background music
  • Find a source of music without ads and with music that would not block certain countries from seeing it (e.g. Germany)
  • Do not record in .flv format since most video editing software do not handle it
  • Add an intro screen so you don't see me hiding OBS
  • Have multiple bugs to work on in case you get stuck in the first one



Creative Commons License
This work by Zambrano Gasparnian, Armen is licensed under a Creative Commons Attribution-Noncommercial-Share Alike 3.0 Unported License.

Sunday, April 17, 2016

Project definition: Give Treeherder the ability to schedule TaskCluster jobs

This is a project definition that I put up for GSoC 2016. This helps students to get started researching the project.

The main things I give in here are:

  • Background
    • Where we came from, where we are and we are heading towards
  • Goal
    • Use case for developers
  • Breakdown of components
    • Rather than all aspects being mixed and not logically separate

NOTE: This project has few parts that have risks and could change the implementation. It depends on close collaboration with dustin.


-----------------------------------
Mentor: armenzg 
IRC:   #ateam channel

Give Treeherder the ability to schedule TaskCluster jobs

This work will enable "adding new jobs" on Treeherder to work with pushes lacking TaskCluster jobs (our new continuous integration system).
Read this blog post to know how the project was built for Buildbot jobs (our old continous integration system).

The main work for this project is tracked in bug 1254325.

In order for this to work we need the following pieces:

A - Generate data source with all possible tasks

B - Teach Treeherder to use the artifact

  • This will require close collaboration with Treeherder engineers
  • This work can be done locally with a Treeherder instance
  • It can also be deployed to the “staging” version of Treeherder to do tests
  • Alternative mentors for this section is: camd

C - Teach pulse_actions to listen for requests from Treeherder

  • pulse_actions is a pulse listener of Treeherder actions
  • You can see pulse_actions’ workflow in here
  • Once part B is completed, we will be able to listen for messages requesting certain TaskCluster tasks to be scheduled and we will schedule those tasks on behalf of the user
  • RISK: Depending if the TaskCluster actions project is completed on time, we might instead make POST requests to an API

Project definition: SETA re-write

As an attempt to attract candidates to GSoC I wanted to make sure that the possible projects were achievable rather than lead them on a path of pain and struggle. It also helps me picture the order on which it makes more sense to accomplish.

It was also a good exercise for students to have to read and ask questions about what was not clear and give lots to read about the project.

I want to share this and another project definition in case it is useful for others.

----------------------------------
We want to rewrite SETA to be easy to deploy through Heroku and to support TaskCluster (our new continuous integration system) [0].

Please read carefully this document before starting to ask questions. There is high interest in this project and it is burdensome to have to re-explain it to every new prospective student.

Main mentor: armenzg (#ateam)
Co-mentor: jmaher (#ateam)

Please read jmaher’s blog post carefully [1] before reading anymore.

Now that you have read jmaher’s blog post, I will briefly go into some specifics.
SETA reduces the number of jobs that get scheduled on a developer’s push.
A job is every single letter you see on Treeherder. For every developer’s push there is a number of these jobs scheduled.
On every push, Buildbot [6] decides what to schedule depending on the data that it fetched from SETA [7].

The purpose of this project is two-fold:
  1. Write SETA as an independent project that is:
    1. maintainable
    2. more reliable
    3. automatically deployed through Heroku app
  2. Support TaskCluster, our new CI (continuous integration system)

NOTE: The current code of SETA [2] lives within a repository called ouija.

Ouija does the following for SETA:
  1. It has a cronjob which kicks in every 12 hours to scrape information about jobs from every push
  2. It takes the information about jobs (which it grabs from Treeherder) into a database

SETA then goes a queries the database to determine which jobs should be scheduled. SETA chooses jobs that are good at reporting issues introduced by developers. SETA has its own set of tables and adds the data there for quick reference.

Involved pieces for this project:
  1. Get familiar with deploying apps and using databases in Heroku
  2. Host SETA in Heroku instead of http://alertmanager.allizom.org/seta.html
  3. Teach SETA about TaskCluster
  4. Change the gecko decision task to reliably use SETA [5][6]
    1. If the SETA service is not available we should fall back to run all tasks/jobs
  5. Document how SETA works and auto-deployments of docs and Heroku
    1. Write automatically generated documentation
    2. Add auto-deployments to Heroku and readthedocs
  6. Add tests for SETA
    1. Add tox/travis support for tests and flake8
  7. Re-write SETA using ActiveData [3] instead of using data collected by Ouija
  8. Make the current CI (Buildbot) use the new SETA Heroku service
  9. Create SETA data for per test information instead of per job information (stretch goal)
    1. On Treeherder we have jobs that contain tests
    2. Tests re-order between those different chunks
    3. We want to run jobs at a per-directory level or per-manifest
  10. Add priorities into SETA data (stretch goal)
    1. Priority 1 gets every time
    2. Priority 2 gets triggered on Y push

[8] http://alertmanager.allizom.org/seta.html


Creative Commons License
This work by Zambrano Gasparnian, Armen is licensed under a Creative Commons Attribution-Noncommercial-Share Alike 3.0 Unported License.