Blog by Piotr Banaszkiewicz

AMY release v1.8.0

A major AMY release, v1.8.0, has been tagged. As you can see below, it was definitely focused on fixing bugs.

New Features

  • Aditya provided a template change that displays a link between a closed workshop request and the corresponding event.
  • Aditya hid survey-related fields on Event-related forms.
  • Chris sped up (again :-) ) tests.
  • Chris removed unnecessary help text for autocompletion fields.
  • Aditya refactored delete views to use DeleteViewContext, essentially making the code more DRY and easier to change.
  • I added the ability to delete entries in the bulk-upload feature.
  • I updated the DataCarpentry self-organized workshops registration form.

Bugfixes

  • Aditya changed uniqueness constraints on Sponsorship model to reflect recent changes he made on that model.
  • Aditya changed display of some Membership model fields.
  • Aditya added missing CSRF tokens in PyData import page.
  • Chris fixed a rare case of email address leakage (CC instead of BCC) on the event details page, the instructors by date page and the workshop staff finder.
  • Aditya changed a uniqueness constraint on the Task model and added some other small improvements.
  • Chris fixed non-working links and corrected the ordering on the all trainings page.
  • Aditya refactored the internal URLs file to use a nested URL structure, which made it a lot more readable.
  • Chris made the “progress” column in the trainees view wider.
  • Aditya hid from the import those instances that had been marked as not to be imported.
  • I fixed the error message shown when the bulk-upload process fails.
  • I fixed a double display of unpublished and published views in very specific circumstances.
  • I stopped counting unresponsive workshops on the workshop issues page.

AMY bugfix release v1.7.2

AMY v1.7.2 was released today. It contains one bug fix provided by Aditya Narayan.

Aditya fixed a bug that caused an HTTP 500 error when accessing /api/v1/todos/user/. The browser calls this API endpoint whenever an admin user loads their dashboard.

AMY releases v1.7 and v1.7.1

After another two weeks of development and two weeks of delays, we’re finally releasing AMY v1.7 and a bugfix v1.7.1. This post is a joint changelog for both of them.

Release v1.7

This release is especially interesting since:

  1. it includes mostly Aditya’s and Chris’ PRs
  2. it includes two big PRs containing the biggest part of Aditya’s and Chris’ Summer projects.

New features

  • Chris Medrela helped check for missing migrations in automated continuous integration service Travis-CI
  • Chris Medrela sped up Travis-CI checks of AMY’s test suite by using a cache directory
  • Aditya Narayan as part of his Summer work added titles and URLs to task objects in AMY (useful feature for PyData conference integration)
  • Aditya Narayan changed form for creating new events so that admins can assign themselves to a new event while creating it
  • Aditya Narayan added a Sponsorship model to AMY and integrated it with AMY (we can now track sponsors for events)
  • Aditya Narayan migrated Host to Organization: it fixed some naming inconsistencies
  • in v1.6 we dropped support for numerical event IDs to rely only on slugs (e.g. 2016-08-13-Krakow or 2017-01-xx-Boston); Aditya Narayan has now cleaned up the leftovers in the code from before that change
  • I added support for the cancelled tag, used to mark events that were supposed to happen but eventually didn’t
  • Chris Medrela added the instructor training workflow, i.e. a huge part of AMY used for instructor training
  • Aditya Narayan added a feature for importing people, events, tasks from PyData conference site in a comfortable way

Bug fixes

  • Chris Medrela tracked down and fixed an error in the part of AMY that lets users log in with credentials other than user/password (currently: GitHub login)
  • I fixed an API error occurring in some views (endpoints) when using the CSV or YAML return format
  • Chris Medrela added access to AMY for people in the invoicing group
  • Chris Medrela replaced entity — with actual char
  • Aditya Narayan added a contact field on Sponsorship model
  • Chris Medrela fixed issue with user social integration with GitHub getting out of sync
  • I fixed JavaScript code responsible for generating dates (it was generating e.g. 2016-8-3, it’s now generating 2016-08-03)

Release v1.7.1

This release contains mostly bug fixes for features we added in v1.7 :-)

Bug fixes

  • Chris Medrela removed an overlooked debugging message alert in one of the views
  • Aditya Narayan added a cancel button to almost all the forms in AMY
  • I added a message to “Apply for Instructor Training” page saying that people cannot register for Fall 2016 open-access training anymore
  • Aditya Narayan fixed “Import from URL” not working on workshop acceptance page
  • Chris Medrela fixed some validation issue in one of training-related forms
  • Chris Medrela added access to admin dashboard in AMY to trainers

New features

  • Chris Medrela added a command line tool for importing trainees progress from previous data format into AMY

AMY release v1.6.2

Whoa, another one?! Yesterday we released v1.6.1, today it’s time for v1.6.2 with some very minor changes.

New features

  • New fields in the training request form:
    • group name will enable us to register groups for the training, without (for now) the need for a new form
    • comment will be a place for any additional information; without it, people would put such information into the additional skills field.
  • Event.slug received new help text containing a format description for admins to use. This field’s validation was also changed so that it only allows entries in this specific format (this is in addition to other validation done by Django, i.e. only Latin characters, digits, underscores and hyphens allowed).
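A format check like the one described for Event.slug could be sketched as follows. The exact pattern AMY uses is not given in this post: the regular expression below is only a guess based on the example slugs 2016-08-13-Krakow and 2017-01-xx-Boston, and the names SLUG_PATTERN and is_valid_slug are made up for illustration.

```python
import re

# Hypothetical pattern: YYYY-MM-DD-location, where month and day may be "xx"
# when they are not known yet. AMY's real validator may differ.
SLUG_PATTERN = re.compile(r"^\d{4}-(\d{2}|xx)-(\d{2}|xx)-[A-Za-z0-9_-]+$")


def is_valid_slug(slug):
    """Return True if the slug matches the assumed format."""
    return SLUG_PATTERN.match(slug) is not None
```

With this pattern, both example slugs pass, while a slug that does not start with a date-like prefix is rejected.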

Bug fixes

  • Migration 0088*, which was supposed to generate fake slugs for events without them, contained an error that we hit in production. I fixed it by adding random characters to a slug whenever the uniqueness constraint was about to be violated.
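The idea behind that fix can be sketched roughly like this. It is not the code from the migration: make_unique_slug, the suffix length and the "-" separator are all assumptions chosen for the example.

```python
import random
import string


def make_unique_slug(base, taken, length=4, rng=random):
    """Return `base`, or `base` plus a random suffix if `base` is already taken.

    `taken` is the collection of slugs that already exist (hypothetical stand-in
    for a database lookup).
    """
    alphabet = string.ascii_lowercase + string.digits
    slug = base
    while slug in taken:
        suffix = "".join(rng.choice(alphabet) for _ in range(length))
        slug = f"{base}-{suffix}"
    return slug
```

Calling make_unique_slug("2016-01-01-fake", {"2016-01-01-fake"}) returns the base slug followed by a dash and four random characters, while a slug that is not taken yet is returned unchanged.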

AMY release v1.6.1

We’re gaining momentum! Two days after the v1.6 release, we’re releasing a minor bug-fix version, v1.6.1, which is not as small as you might think.

New features

  • Aditya Narayan changed the default value for invoice status field for events to “Not invoiced” (it was: “unknown”).
  • I added a link to the login form on the logout page. In future, we’re going to redirect to the login page with a message, but we’re waiting for Django to release a feature that will allow us to do this easily.
  • I restyled login page so that it’s clearer that people can use user+password OR GitHub account to log into AMY.

Bug fixes

  • Chris Medrela provided tests that make sure we don’t have bugs associated with saving M2M-related objects in an AutoProfileUpdateForm.
  • I added a link to the profile view page in the top navigation bar. This links to a trainee-dashboard page if current user is not an admin, and to a person-details page otherwise.
  • Chris Medrela fixed indentation of lists when they’re placed inside of tables.
  • Chris Medrela added clickable links in some help texts in the training request form.
  • Chris Medrela fixed wording in one field of the aforementioned form.
  • I added a missing migration (we commonly forget to add migrations when there are small changes introduced).

Other

  • Aditya Narayan changed some text fields in AMY’s models so that they cannot be equal to a NULL (or None) value. Instead an empty string is used for these fields’ default values. Some fields, especially ones with a uniqueness constraint, had to be left as nullable. In particular, this makes the Event.slug a required field.

AMY release v1.6

After (I think) 12 days of delay and 7 days of postponing, we finally closed and released AMY v1.6. It packs a whole lot of changes and bugfixes!

New features

  • I implemented a Data-Carpentry form for submitting requests for running self-organized workshops.
  • I added a histogram to the instructors’ teaching frequency report page.
  • Aditya Narayan added “Contact all” button on the all persons page.
  • Aditya Narayan continued W. Trevor King’s work on the Language model and now we can accurately track languages amongst multiple forms and related models (e.g. events and persons).
  • Aditya Narayan added a summary of tasks per role on person’s details page.
  • Chris Medrela added an application form for individuals wanting to become instructors.
  • I added Language support in additional forms (original PR was missing language support in some forms).
  • Big: Chris Medrela worked hard to bring GitHub authentication into AMY (with success!). There are some caveats, but we’ll smooth them out for the next release. This work included opening AMY to other users (a move we were afraid of), and tests for each and every view to ensure we got the permissions right.
  • In the same PR, Chris Medrela added an AutoUpdateProfileForm used by users (who can log in from GitHub) to self-populate their profiles.
  • Aditya Narayan defined sorting of tasks on the person’s details page.

Bug fixes

  • I fixed a bug that caused an IntegrityError when merging people with similar tasks (a task has a role, a person and an event; the tasks of the two people were identical except for the person). IntegrityError means that a uniqueness constraint was violated (i.e. after the merge there would be two Task(role, personA, event) entries, which is prohibited).
  • Chris Medrela fixed interpolation on some of our charts that looked like the data was swinging, while in reality it wasn’t.
  • Aditya Narayan fixed default field values on the “All activities over time” page; now the fields have meaningful default values and the datetime inputs have a proper calendar widget.
  • Aditya Narayan reworked teaching frequency report to eliminate bug that duplicated numbers for people simultaneously marked as SWC and DC instructors.
  • I fixed some corner cases in event validation (behavior for required or optional tags/metadata (see below)).
  • I fixed a bug resulting in a 500 Server Error when following a link to a non-existent Host.
  • Chris Medrela added one small migration that was missing from the codebase.
  • Greg Wilson fixed a bug in the API that prevented list from working on generator objects for some renderers (CSV and YAML).
  • Prerit Garg fixed a specific bug that prevented saving a permissions form when the person’s email field was empty.
  • Chris Medrela fixed a TrainingRequest form that displayed additional fields (that weren’t supposed to appear).
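The IntegrityError from the first item in this list can be reproduced outside of Django. Below is a minimal sketch using SQLite; the table layout and the merge_tasks helper are invented for the example and are not AMY’s actual merge code.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE task (role TEXT, person TEXT, event TEXT, "
    "UNIQUE (role, person, event))"
)
conn.executemany(
    "INSERT INTO task VALUES (?, ?, ?)",
    [("instructor", "personA", "event1"), ("instructor", "personB", "event1")],
)


def merge_tasks(conn, keep, drop):
    """Move tasks from `drop` to `keep`, skipping ones `keep` already has."""
    # Delete the tasks that would become duplicates after reassignment...
    conn.execute(
        "DELETE FROM task WHERE person = ? AND EXISTS ("
        "  SELECT 1 FROM task AS t2 WHERE t2.person = ?"
        "  AND t2.role = task.role AND t2.event = task.event)",
        (drop, keep),
    )
    # ...then reassign whatever is left.
    conn.execute("UPDATE task SET person = ? WHERE person = ?", (keep, drop))


try:
    # Naive merge: blindly reassigning personB's task violates the constraint.
    conn.execute("UPDATE task SET person = 'personA' WHERE person = 'personB'")
    naive_merge_failed = False
except sqlite3.IntegrityError:
    naive_merge_failed = True

merge_tasks(conn, keep="personA", drop="personB")
remaining = conn.execute("SELECT role, person, event FROM task").fetchall()
```

The naive update fails with sqlite3.IntegrityError, while the two-step merge leaves a single task for personA.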

Other

  • Chris Medrela refactored “tags” to “metadata”. These tags were key-value pairs describing a workshop’s dates, times, location, instructors and helpers. We changed the name to “metadata” to avoid confusion with the Tag model.
  • Chris Medrela sped up our tests by changing the password hashing algorithm to a faster (and weaker) one, which is one of the test speedups suggested by the Django development team.
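Django’s testing documentation suggests overriding the PASSWORD_HASHERS setting with a cheap hasher while tests run, because the default hasher is slow by design. A settings fragment along those lines might look like the one below; whether AMY uses exactly this hasher, or a separate settings module for tests, is an assumption.

```python
# Test-only settings fragment (for example in a settings module used only by
# the test runner). MD5 is fast but insecure: never use this in production.
PASSWORD_HASHERS = [
    "django.contrib.auth.hashers.MD5PasswordHasher",
]
```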

Summary

I’d like to thank Greg Wilson for supporting us throughout the exams, and even when we disappointed him by continuously not delivering and rescheduling this release. Greg, you’re awesome!

Finally I also want to point to Chris Medrela’s blog where he regularly posts AMY’s development during this Summer of Code.

AMY release v1.5.4

Yesterday AMY v1.5.4 was released with a bunch of interesting changes.

New features

  • AMY can now go through all active workshops and check whether their metadata (slug, start/end date, instructors and helpers) has changed. If so, a notification is shown to the person associated with the event.
  • Aditya Narayan improved history log by enabling it to show related objects’ real names instead of IDs.
  • Greg Wilson added a button to mail everyone involved in a workshop
  • as part of his GSoC 2016 project, Chris Medrela added the trainings dashboard in its first shape
  • Chris Medrela in collab with Greg Wilson added SWC/DC instructor badge indicators to: all persons, event details, and “find instructors” views
  • Finally, I upgraded the “Find instructors” view to enable admins to search for not only instructors, but also in-progress instructor trainees and people who once had been associated with the workshop organization. Therefore the new name for “Finding instructors” is now “Find Workshop Staff”.

Bug fixes

  • Aditya Narayan fixed a permissions issue that occurred when people without permission to add ToDo items accessed the event details page.
  • I fixed a small error preventing the DataCarpentry logo from showing up on the DC workshop request page.
  • I fixed a small error that listed people twice in the admin lookup backend if they had both superuser and admin group permissions.
  • An even smaller error pointed admins to the wrong URL in the import event template. It is now fixed.
  • Chris Medrela fixed errors in the former “debriefing” view (now “instructors by date”) that occurred when generating emails while some people’s email addresses were unavailable.
  • Chris Medrela fixed what I think is the oldest unnoticed issue ever: a wrong link generated for an airport’s IATA code.
  • Finally, Chris Medrela added a template that was missing from one of the new features in this release.

Summary

I’d like to thank Chris and Aditya for their continuous work on AMY. This is just the beginning, and if you’re curious go check out what’s planned for v1.6 (probably the next release). There’s a lot happening around AMY recently, so stay tuned for the next release.

AMY release v1.5.3

A minor version of AMY, v1.5.3, was released two weeks ago. I failed to post proper release notes back then, so here they are.

New features

Now it’s easier to add a person to the database if they have already submitted a profile update request.

This is specifically useful for admins if they want to add one person and can contact them to get more details (like affiliation or airport).

Bug fixes

  • Aditya Narayan fixed Django template tags autoescaping on the revision page (ie. each page the changes log links to)
  • Aditya Narayan again fixed “Update from URL” functionality that didn’t update event’s URL in specific conditions.

Thanks a lot, Aditya!

AMY bugfix release v1.5.2 and bug postmortem

24 hours ago, Maneesha Sane noticed and reported on GitHub that one of the tags in AMY was missing.

This observation led to an investigation on the servers and eventually to a fix for the critical bug that caused the data loss.

But before we jump into sysadmin work…

What is a “tag” in AMY terms?

A tag is a label that we give to workshops. For example, all Software Carpentry workshops have the SWC tag, and all Data Carpentry workshops have the DC tag.

There are more labels we use, and the one that went missing was DC.

One event can have anywhere between zero and all of the tags in the system, which means there’s a many-to-many relationship between events and tags. This type of relationship requires an additional intermediate table in the database.

The contents of that table were missing because they were removed together with the DC tag.
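To make the intermediate table concrete, here is a small SQLite sketch. The table and column names are hypothetical, and the cascade is done by the database here, whereas Django performs the equivalent cleanup itself when an object is deleted; the visible effect is the same.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only if asked
conn.executescript("""
CREATE TABLE event (id INTEGER PRIMARY KEY, slug TEXT);
CREATE TABLE tag (id INTEGER PRIMARY KEY, name TEXT);
-- Intermediate table: one row per (event, tag) assignment.
CREATE TABLE event_tags (
    event_id INTEGER REFERENCES event(id) ON DELETE CASCADE,
    tag_id INTEGER REFERENCES tag(id) ON DELETE CASCADE
);
INSERT INTO event VALUES (1, 'event-one'), (2, 'event-two');
INSERT INTO tag VALUES (1, 'SWC'), (2, 'DC');
INSERT INTO event_tags VALUES (1, 1), (1, 2), (2, 2);
""")


def count_assignments(conn):
    """Number of rows in the intermediate table."""
    return conn.execute("SELECT COUNT(*) FROM event_tags").fetchone()[0]


before = count_assignments(conn)
# Deleting the tag itself also wipes every assignment that pointed to it.
conn.execute("DELETE FROM tag WHERE name = 'DC'")
after = count_assignments(conn)
```

Three assignments exist before the delete and only one (the SWC one) afterwards, even though no event was touched.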

Investigation

I started the investigation by narrowing down the timespan in which the event that led to the data loss occurred.

Then I read the WWW server access logs to find out what happened in that timespan, hoping to find the bug.

After narrowing down the list of suspects, I was able to reproduce the bug.

Finally I retrieved the lost data from the most recent backup that still had it.

Ways to remove tags from AMY

There’s no interface for removing tags other than Django’s auto-generated admin interface; only a couple of people have access to it.

So the data loss was caused either by human error or by a bug in the code. This conclusion helped me define what I should be looking for in the WWW server log.

Narrowing down the event occurrence timespan

AMY is backed up by multiple systems; I logged into each of them and ran multiple SQL queries on different databases to find the newest backup that still had the DC tag.

It turned out that the backup from 2016-04-06 17:00 UTC-4 was the most recent one that still had the DC tag.

In the meantime I was fighting timezone correction… Our backup systems are in different datacenters and were running in different timezones.
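One way to make timestamps from differently configured machines comparable is to convert them all to UTC first. The sketch below uses the backup time quoted above; the second timestamp and its UTC+2 offset are made up purely to show two wall-clock readings of the same instant, and to_utc is a hypothetical helper.

```python
from datetime import datetime, timedelta, timezone


def to_utc(year, month, day, hour, minute, utc_offset_hours):
    """Build an aware datetime from wall-clock time plus offset, converted to UTC."""
    tz = timezone(timedelta(hours=utc_offset_hours))
    return datetime(year, month, day, hour, minute, tzinfo=tz).astimezone(timezone.utc)


# From the post: the newest backup containing the tag, 2016-04-06 17:00 at UTC-4.
backup = to_utc(2016, 4, 6, 17, 0, utc_offset_hours=-4)

# Hypothetical: the same instant as reported by a machine running at UTC+2.
same_instant_elsewhere = to_utc(2016, 4, 6, 23, 0, utc_offset_hours=2)
```

Once both values are in UTC, they compare equal, and ordering backups or log entries becomes a plain comparison of datetimes.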

Reading access log

The first thing I checked in the access log was whether anyone had used the admin panel to remove the tag. Unfortunately, this possibility was quickly ruled out, so the loss had to be caused by a bug in the code.

However, no action stood out when I first read the log.

Short, important side story: Software Carpentry website rebuilds every 30 minutes. Each rebuild is shown in the log by multiple requests to AMY’s API:

[…] GET /api/v1/export/badges.yaml => generated 69122 bytes […]
[…] GET /api/v1/export/instructors.yaml => generated 50488 bytes […]
[…] GET /api/v1/events/published.yaml?tag=SWC => generated 231344 bytes […]
[…] GET /api/v1/events/published.yaml?tag=DC => generated 17806 bytes […]
[…] GET /api/v1/events/published.yaml?tag=TTT => generated 7879 bytes […]

The website grabs published events tagged SWC, DC and TTT. This sequence of requests repeats every 30 minutes.

After reading the log over and over I noticed that two consecutive calls to /api/v1/events/published.yaml?tag=DC yielded results of very different response lengths:

[…] [Wed Apr  6 15:01:11 2016] GET /api/v1/events/published.yaml?tag=DC => generated 17806 bytes […]
[... many more requests ...]
[…] [Wed Apr  6 15:30:10 2016] GET /api/v1/events/published.yaml?tag=DC => generated 261521 bytes […]

Apparently, when the DC tag disappeared, the API started returning all published events, no matter whether they were tagged SWC or something else.

This was a clear indication that the DC tag disappeared between 15:01 and 15:30.

That timespan doesn’t look like 17:00. Timezones… programmer’s nightmare.
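Spotting such a jump by eye takes a while; a small script can do the same job. This is only a sketch: the regular expression relies solely on the part of the log line visible in the excerpts above (the rest of the format is elided there), and find_size_jumps with its factor threshold is a hypothetical helper.

```python
import re

# Matches only the visible part of the log line: path and response size.
LINE_RE = re.compile(r"GET (?P<path>\S+) => generated (?P<size>\d+) bytes")


def find_size_jumps(lines, path, factor=2.0):
    """Return (previous_size, size) pairs where the response size for `path`
    changed by more than `factor` between consecutive requests."""
    jumps = []
    previous = None
    for line in lines:
        match = LINE_RE.search(line)
        if match is None or match.group("path") != path:
            continue
        size = int(match.group("size"))
        if previous is not None and (
            size > previous * factor or size * factor < previous
        ):
            jumps.append((previous, size))
        previous = size
    return jumps


log = [
    "[…] [Wed Apr  6 15:01:11 2016] GET /api/v1/events/published.yaml?tag=DC => generated 17806 bytes […]",
    "[…] [Wed Apr  6 15:30:10 2016] GET /api/v1/events/published.yaml?tag=DC => generated 261521 bytes […]",
]
jumps = find_size_jumps(log, "/api/v1/events/published.yaml?tag=DC")
```

For the two DC requests quoted earlier, the function reports a single jump from 17806 to 261521 bytes.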

Suspects

There was some user activity in this 30-minute window and one thing caught my eye:

[…] POST /workshops/events/merge?event_a_0=2016-05-06-RDAP16-Atlanta&event_b_0=2016-05-06-asist […]

(The actual URL was slightly changed to remove unnecessary information.)

This was a call to event merge functionality: someone wanted to merge workshops 2016-05-06-RDAP16-Atlanta and 2016-05-06-asist.

Short side note: the merge functionality lets the user choose a more advanced merge strategy; one can select which properties (or fields) of the final event should come from event A (2016-05-06-RDAP16-Atlanta in our example) and which from event B (2016-05-06-asist). Additionally, in the case of an event’s tags, it’s possible to combine them from both base events.

I started testing different strategies. I had a feeling that the bug had something to do with strategy for event tags. :)

Finally I reproduced the bug by using the following strategy:

  • base event: 2016-05-06-RDAP16-Atlanta (event A)
  • tags: from event B.

Data retrieval

At that point I decided to retrieve the lost data using SQL import/export functionality from the optimal (newest & containing the lost data) backup found earlier.

Bug

The only code used in event merge functionality that would trigger accidental removal is:

            if value == 'obj_a' and manager != related_a:
                manager.all().delete()
                manager.add(*list(related_a.all()))

            elif value == 'obj_b' and manager != related_b:
                manager.all().delete()
                manager.add(*list(related_b.all()))

This code is used for substituting related objects (tags in our case). It works like this:

if some field’s strategy is to switch to objects from the other event, then remove all currently assigned objects and add objects from the other event’s field.

Translated into tags:

if user wants to use event 2016-05-06-RDAP16-Atlanta as base event, but keep tags from the other event (2016-05-06-asist) then remove current tags from base event and add tags from the other event.

See what’s going on here? Base event’s tags were removed instead of being unassigned.

In this section I’m going to talk about how relations work and whether they can be unassigned instead of being removed.

For many-to-many relationships (e.g. multiple events can be assigned multiple tags) Django creates an intermediate table that stores the assignments. In this case, unassigning an event from a tag is as simple as removing that stored assignment from the intermediate table.

For one-to-many relationships (e.g. multiple events can have the same organizer) there’s no need for an additional table; storing the organizer looks like event.organizer = SomeOrganizer.

In the case of one-to-many relationships we can unassign the event from SomeOrganizer if and only if the event.organizer field can store a NULL value. If it cannot, then we have to remove the event.

So the bug existed because the case of unassignment was not taken into account – only removal of related objects was accounted for.
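The NULL condition for one-to-many relationships can be checked directly at the database level. The sketch below uses SQLite with two made-up tables, one where the organizer column is nullable and one where it is not; try_unassign is a hypothetical helper.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE organizer (id INTEGER PRIMARY KEY, name TEXT);
-- organizer_id may be NULL here, so an event can exist without an organizer.
CREATE TABLE event_nullable (
    id INTEGER PRIMARY KEY,
    organizer_id INTEGER REFERENCES organizer(id)
);
-- organizer_id is NOT NULL here, so every event must keep some organizer.
CREATE TABLE event_required (
    id INTEGER PRIMARY KEY,
    organizer_id INTEGER NOT NULL REFERENCES organizer(id)
);
INSERT INTO organizer VALUES (1, 'SomeOrganizer');
INSERT INTO event_nullable VALUES (1, 1);
INSERT INTO event_required VALUES (1, 1);
""")


def try_unassign(conn, table):
    """Try to set organizer_id to NULL; return True if the database allows it."""
    try:
        conn.execute(f"UPDATE {table} SET organizer_id = NULL WHERE id = 1")
        return True
    except sqlite3.IntegrityError:
        return False


nullable_ok = try_unassign(conn, "event_nullable")
required_ok = try_unassign(conn, "event_required")
```

Unassigning succeeds for the nullable column and fails for the NOT NULL one, which is the situation where removal is the only option left.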

Fix: need to find out when we can unassign

Long story short: in Django only a related manager with a .clear method can unassign; if this method is not present, then the only option is removal.

So the fixed code looks like this (minus the comments):

            if value == 'obj_a' and manager != related_a:
                if hasattr(manager, 'clear'):
                    manager.clear()
                else:
                    manager.all().delete()
                manager.set(list(related_a.all()))

            elif value == 'obj_b' and manager != related_b:
                if hasattr(manager, 'clear'):
                    manager.clear()
                else:
                    manager.all().delete()
                manager.set(list(related_b.all()))

(Yes, it should probably use a try/except block instead of hasattr; pull requests are welcome!)
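For the curious, a try/except variant could look roughly like the sketch below. It runs without Django: FakeManager, FakeClearableManager and FakeQuerySet are simplified stand-ins invented for this example, and replace_related is a hypothetical helper, not a function from AMY. Only the attribute lookup sits inside the try block, so an AttributeError raised inside clear() itself would not be swallowed.

```python
class FakeQuerySet:
    """Stand-in for a queryset: delete() removes the objects themselves."""

    def __init__(self, manager):
        self.manager = manager

    def delete(self):
        self.manager.table.difference_update(self.manager.assigned)
        self.manager.assigned.clear()


class FakeManager:
    """Stand-in for a related manager without clear() (removal only)."""

    def __init__(self, table, assigned):
        self.table = table  # every object that exists in the "database"
        self.assigned = set(assigned)  # objects related to one event

    def all(self):
        return FakeQuerySet(self)

    def set(self, objs):
        self.assigned = set(objs)


class FakeClearableManager(FakeManager):
    """Stand-in for a related manager that can unassign."""

    def clear(self):
        self.assigned.clear()


def replace_related(manager, new_objects):
    """Swap related objects, unassigning when possible and deleting otherwise."""
    try:
        clear = manager.clear
    except AttributeError:
        manager.all().delete()  # no way to unassign: remove the objects
    else:
        clear()  # unassign only; the objects themselves survive
    manager.set(new_objects)
```

With a clearable manager the DC tag stays in the table after the swap; with the plain manager the previously assigned object is gone for good.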

Final words

All in all, I feel good about this bug; if anything, I’d like to eliminate the erroneous timezone arithmetic.

Also, all the backup mechanics and logging worked really well.

As a result of the investigation described above, last night I released AMY v1.5.2 with the fix for this bug.

AMY release v1.5.1

Since my return to university for my MSc, the development of AMY has slowed down. This past month we had a number of submissions from prospective GSOC’16 students (yay!) and, for the first time, the number of bugs fixed exceeded the number of new enhancements.

Since the number of new features is small, I decided to release a minor version (v1.5.1).

Contributions by GSOC students

March 2016 was the GSOC’16 application period for students. We had a lot of students this year and we encouraged them to take a look at AMY and maybe fix something. This resulted in a number of good contributions.

New features

Starting with new features since there are so few of them:

  • Greg Wilson extended the check_certificates.py command to additionally return events people participated in
  • Shubham Singh added “Notes” field to instructor profile update form
  • Shubham Singh added new tag “hackaton”
  • Greg Wilson removed command check_badges.py
  • I enabled autogeneration of user’s username after they’re added to the database
  • Greg Wilson added link to Privacy Policy in the footer.

Bug fixes

  • Nikhil Verma found and fixed “List duplicates” page error when no duplicates existed
  • Chris Medrela found and fixed 404 page for revision that didn’t exist
  • Greg Wilson fixed NameError in check_certificates.py
  • I fixed a 500 error appearing when user submitted incomplete form used for matching people’s names
  • Maneesha Sane fixed minimal number of instructors required in our workshop request form
  • I fixed API renderers (CSV, YAML) not iterating generators but displaying their textual representations
  • I fixed instructors-num-taught report to include people’s names
  • I fixed a small typo in the name of post-workshop survey for instructors (it was called “pre”)
  • I made the emails case-insensitive
  • I fixed some 500 errors related to event importing.
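The renderer bug from this list (generators shown as their textual representation) is easy to reproduce in plain Python. The functions below are a simplified illustration, not the actual renderer code from AMY; event_slugs, render_csv_naive and render_csv are made-up names.

```python
def event_slugs():
    """A generator, standing in for lazily produced API data."""
    yield "2016-05-06-RDAP16-Atlanta"
    yield "2016-05-06-asist"


def render_csv_naive(data):
    # Bug: str() of a generator is its repr, not its items.
    return str(data)


def render_csv(data):
    # Fix: materialize the iterable first, then render each item.
    return "\n".join(list(data))
```

The naive version returns a string that begins with "<generator object", while the fixed one returns the two slugs, one per line.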