Blog by Piotr Banaszkiewicz

AMY release v1.5.4

Yesterday AMY v1.5.4 was released with a bunch of interesting changes.

New features

  • AMY is now capable of going through all active workshops and checking whether their metadata (slug, start/end date, instructors and helpers) has changed. If so, a notification is shown to the person associated with the event.
  • Aditya Narayan improved the history log by enabling it to show related objects’ real names instead of IDs.
  • Greg Wilson added a button to mail everyone involved in a workshop.
  • As part of his GSoC 2016 project, Chris Medrela added the first version of the trainings dashboard.
  • Chris Medrela, in collaboration with Greg Wilson, added SWC/DC instructor badge indicators to the all persons, event details, and “find instructors” views.
  • Finally, I upgraded the “Find instructors” view to enable admins to search not only for instructors, but also for in-progress instructor trainees and people who were once associated with workshop organization. The view is therefore now called “Find Workshop Staff”.
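
The metadata check in the first item above boils down to comparing what AMY has stored with what the workshop currently reports. A minimal sketch of such a comparison, assuming hypothetical dictionaries and field names (this is not AMY's actual code):

```python
# Hypothetical sketch: field names and data layout are assumptions,
# chosen to mirror the metadata listed in the release notes.
WATCHED_FIELDS = ("slug", "start", "end", "instructors", "helpers")


def changed_fields(stored, fetched):
    """Return the watched fields whose values differ between two dicts."""
    return [name for name in WATCHED_FIELDS
            if stored.get(name) != fetched.get(name)]


stored = {
    "slug": "2016-05-06-asist",
    "start": "2016-05-06",
    "end": "2016-05-07",
    "instructors": ["A"],
    "helpers": [],
}
# Pretend the workshop page now reports a later end date and a new helper.
fetched = dict(stored, end="2016-05-08", helpers=["B"])

changes = changed_fields(stored, fetched)
print(changes)
```

A non-empty result is what would trigger the notification.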

Bug fixes

  • Aditya Narayan fixed a permissions issue that occurred when people without permission to add ToDo items accessed the event details page.
  • I fixed a small error preventing the Data Carpentry logo from showing up on the DC workshop request page.
  • I fixed a small error that duplicated people with both superuser and admin group permissions in the admin lookup backend.
  • An even smaller error pointed admins to the wrong URL in the import event template. It is now fixed.
  • Chris Medrela fixed errors in the former “debriefing” view (now “instructors by date”) that occurred when generating emails while some people’s email addresses were unavailable.
  • Chris Medrela fixed what I think is the oldest unnoticed issue ever: a wrong link generated for an airport’s IATA code.
  • Finally, Chris Medrela fixed a missing template in one of the new features for this release.

Summary

I’d like to thank Chris and Aditya for their continuous work on AMY. This is just the beginning, and if you’re curious, go check out what’s planned for v1.6 (probably the next release). There has been a lot happening around AMY recently, so stay tuned.

AMY release v1.5.3

A minor version, v1.5.3, of AMY was released two weeks ago. I have failed to post proper release notes since then, so here they are.

New features

Now it’s easier to add a person to the database if they have already submitted a profile update request.

This is especially useful for admins who want to add one person and can contact them to get more details (like affiliation or airport).

Bug fixes

  • Aditya Narayan fixed Django template tags autoescaping on the revision page (i.e. each page the changes log links to).
  • Aditya Narayan also fixed the “Update from URL” functionality, which didn’t update the event’s URL under specific conditions.

Thanks a lot, Aditya!

AMY bugfix release v1.5.2 and bug postmortem

24 hours ago, Maneesha Sane noticed and reported on GitHub that one of the tags in AMY was missing.

This observation led to an investigation on the servers and eventually to a fix for the critical bug that caused the data loss.

But before we jump into sysadmin work…

What is a “tag” in AMY terms?

A tag is a label that we give to various workshops. For example, all Software Carpentry workshops have the SWC tag, and all Data Carpentry workshops have the DC tag.

There are more labels we use, and the one that went missing was DC.

One event can have between zero and all the tags we have in the system, which means it’s a many-to-many relationship between events and tags. This type of relationship requires an additional intermediate table in the database.

The DC assignments stored in that table were missing because they were removed together with the DC tag.
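
To make the intermediate table concrete, here is a small sketch using SQLite from Python's standard library. The table and column names are made up for illustration, and the database-level ON DELETE CASCADE stands in for the cascade that Django performs itself when an object is deleted:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only on request
conn.executescript("""
    CREATE TABLE event (id INTEGER PRIMARY KEY, slug TEXT);
    CREATE TABLE tag (id INTEGER PRIMARY KEY, name TEXT);
    -- Intermediate table: one row per (event, tag) assignment.
    CREATE TABLE event_tags (
        event_id INTEGER REFERENCES event(id) ON DELETE CASCADE,
        tag_id INTEGER REFERENCES tag(id) ON DELETE CASCADE
    );
    INSERT INTO event VALUES (1, 'workshop-a'), (2, 'workshop-b');
    INSERT INTO tag VALUES (1, 'SWC'), (2, 'DC');
    INSERT INTO event_tags VALUES (1, 1), (1, 2), (2, 2);
""")

# Deleting the tag itself also wipes every assignment that pointed to it.
conn.execute("DELETE FROM tag WHERE name = 'DC'")

remaining = conn.execute(
    "SELECT event_id, tag_id FROM event_tags ORDER BY event_id"
).fetchall()
events_left = conn.execute("SELECT COUNT(*) FROM event").fetchone()[0]
print(remaining, events_left)
```

The events survive, but every event loses its DC assignment, which matches the symptom described above.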

Investigation

I started the investigation by narrowing down the timespan in which the event that led to the data loss occurred.

Then I read the web server access logs to find out what happened in that timespan, hoping to find the bug.

After narrowing down the list of suspects, I was able to reproduce the bug.

Finally, I retrieved the lost data from the most recent backup that still had it.

Ways to remove tags from AMY

There’s no interface for removing tags other than Django’s auto-generated admin interface; only a couple of people have access to it.

So the data loss was caused either by human error or by a bug in the code. This conclusion helped me define what I should be looking for in the web server log.

Narrowing the event occurrence timespan

AMY is backed up by multiple systems; I logged into each of them and ran multiple SQL queries on different databases to find the newest backup that still had the DC tag.

It turned out that the backup from 2016-04-06 17:00 UTC-4 was the most recent one that still had the DC tag.

In the meantime I was fighting timezone correction… Our backup systems are in different datacenters and were running in different timezones.
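
One way to take the guesswork out of comparing such timestamps is to convert everything to UTC first. A minimal sketch, assuming each system's UTC offset is known (the helper name is made up; only the backup timestamp comes from this post):

```python
from datetime import datetime, timedelta, timezone


def to_utc(naive_local, utc_offset_hours):
    """Attach a fixed UTC offset to a naive timestamp and convert it to UTC."""
    local_zone = timezone(timedelta(hours=utc_offset_hours))
    return naive_local.replace(tzinfo=local_zone).astimezone(timezone.utc)


# The backup timestamp quoted above: 2016-04-06 17:00 at UTC-4.
backup_utc = to_utc(datetime(2016, 4, 6, 17, 0), -4)
print(backup_utc.isoformat())  # 2016-04-06T21:00:00+00:00
```

Timestamps from other machines would go through the same helper with their own offsets before being compared.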

Reading access log

The first thing I checked in the access log was whether anyone had used the admin panel to remove the tag. Unfortunately this possibility was quickly ruled out, so the loss had to be caused by a bug in the code.

However, after reading the log no action stood out.

Short, important side story: the Software Carpentry website rebuilds every 30 minutes. Each rebuild shows up in the log as multiple requests to AMY’s API:

[…] GET /api/v1/export/badges.yaml => generated 69122 bytes […]
[…] GET /api/v1/export/instructors.yaml => generated 50488 bytes […]
[…] GET /api/v1/events/published.yaml?tag=SWC => generated 231344 bytes […]
[…] GET /api/v1/events/published.yaml?tag=DC => generated 17806 bytes […]
[…] GET /api/v1/events/published.yaml?tag=TTT => generated 7879 bytes […]

The website grabs published events tagged with SWC, DC and TTT. This sequence of requests repeats every 30 minutes.

After reading the log over and over I noticed that two consecutive calls to /api/v1/events/published.yaml?tag=DC yielded results of very different response lengths:

[…] [Wed Apr  6 15:01:11 2016] GET /api/v1/events/published.yaml?tag=DC => generated 17806 bytes […]
[... many more requests ...]
[…] [Wed Apr  6 15:30:10 2016] GET /api/v1/events/published.yaml?tag=DC => generated 261521 bytes […]

Apparently, when the DC tag disappeared, the API started returning all the published events, no matter if they were tagged SWC or something else.

This was a clear indication that the DC tag disappeared between 15:01 and 15:30.
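
Spotting such a jump by eye takes a while; a short script can do it instead. The sketch below only understands the abbreviated log lines quoted in this post (the real log format has more fields, elided here), and the function name and threshold are assumptions:

```python
import re

# Matches the abbreviated form quoted above; real log lines contain more.
LINE_RE = re.compile(r"GET (?P<path>\S+) => generated (?P<size>\d+) bytes")


def find_size_jumps(lines, path, factor=5.0):
    """Yield (previous, current) sizes of consecutive requests to `path`
    whose response sizes differ by more than `factor` times."""
    previous = None
    for line in lines:
        match = LINE_RE.search(line)
        if not match or match.group("path") != path:
            continue
        size = int(match.group("size"))
        if previous is not None:
            small, big = sorted((previous, size))
            if big > factor * max(small, 1):
                yield previous, size
        previous = size


log = [
    "[Wed Apr  6 15:01:11 2016] GET /api/v1/events/published.yaml?tag=DC => generated 17806 bytes",
    "GET /api/v1/events/published.yaml?tag=TTT => generated 7879 bytes",
    "[Wed Apr  6 15:30:10 2016] GET /api/v1/events/published.yaml?tag=DC => generated 261521 bytes",
]
jumps = list(find_size_jumps(log, "/api/v1/events/published.yaml?tag=DC"))
print(jumps)
```

Requests to other endpoints are ignored, so only responses for the same query are compared with each other.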

That timespan doesn’t look like 17:00. Timezones… programmer’s nightmare.

Suspects

There was some user activity in this 30-minute window and one thing caught my eye:

[…] POST /workshops/events/merge?event_a_0=2016-05-06-RDAP16-Atlanta&event_b_0=2016-05-06-asist […]

(The actual URL was slightly changed to remove unnecessary information.)

This was a call to event merge functionality: someone wanted to merge workshops 2016-05-06-RDAP16-Atlanta and 2016-05-06-asist.

Short side note: the merge functionality lets the user choose a more advanced merge strategy; one can select which properties (or fields) of the final event should come from event A (2016-05-06-RDAP16-Atlanta in our example) and which from event B (2016-05-06-asist). Additionally, in the case of an event’s tags, it’s possible to combine the tags from both base events.

I started testing different strategies. I had a feeling that the bug had something to do with strategy for event tags. :)

Finally I reproduced the bug by using the following strategy:

  • base event: 2016-05-06-RDAP16-Atlanta (event A)
  • tags: from event B.

Data retrieval

At that point I decided to retrieve the lost data using SQL import/export functionality from the optimal (newest & containing the lost data) backup found earlier.

Bug

The only code used in event merge functionality that would trigger accidental removal is:

            if value == 'obj_a' and manager != related_a:
                manager.all().delete()
                manager.add(*list(related_a.all()))

            elif value == 'obj_b' and manager != related_b:
                manager.all().delete()
                manager.add(*list(related_b.all()))

This code is used for substituting related objects (tags in our case). It works like this:

if some field’s strategy is to switch to objects from the other event, then remove all currently assigned objects and add objects from the other event’s field.

Translated into tags:

if the user wants to use event 2016-05-06-RDAP16-Atlanta as the base event, but keep the tags from the other event (2016-05-06-asist), then remove the current tags from the base event and add the tags from the other event.

See what’s going on here? The base event’s tags were removed (deleted from the database) instead of being unassigned from the event.

In this section I’m going to talk about how relations work and whether related objects can be unassigned instead of being removed.

For many-to-many relationships (e.g. multiple events can be assigned multiple tags) Django creates an intermediate table that stores the assignments. In this case, unassigning an event from a tag is as simple as removing that stored assignment from the intermediate table.

For one-to-many relationships (e.g. multiple events can have the same organizer) there’s no need for an additional table; storing the organizer looks like event.organizer = SomeOrganizer.

In the case of one-to-many relationships we can unassign the event from SomeOrganizer if and only if the event.organizer field can store a NULL value. If it cannot, then we have to remove the event.

So the bug existed because the case of unassignment was not taken into account – only removal of related objects was accounted for.
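
The difference between the two operations can be shown with a toy in-memory model. This is plain Python rather than Django, and all names in it are invented for illustration:

```python
class ToyDatabase:
    """Tiny stand-in for a tag table plus its intermediate table."""

    def __init__(self):
        self.tags = {"SWC", "DC"}
        # (event, tag) pairs play the role of intermediate-table rows.
        self.assignments = {
            ("event-a", "DC"), ("event-b", "DC"), ("event-c", "SWC"),
        }

    def delete_tags_of(self, event):
        """Removal: the tag objects themselves are deleted, so every
        assignment pointing at them disappears for all events."""
        doomed = {tag for (ev, tag) in self.assignments if ev == event}
        self.tags -= doomed
        self.assignments = {
            (ev, tag) for (ev, tag) in self.assignments if tag not in doomed
        }

    def clear_tags_of(self, event):
        """Unassignment: only this event's assignments are dropped."""
        self.assignments = {
            (ev, tag) for (ev, tag) in self.assignments if ev != event
        }


removed = ToyDatabase()
removed.delete_tags_of("event-a")

unassigned = ToyDatabase()
unassigned.clear_tags_of("event-a")

print(sorted(removed.tags), sorted(unassigned.tags))
```

After removal, event-b has lost its DC tag as well, even though it was never part of the merge; after unassignment only event-a is affected.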

Fix: need to find out when we can unassign

Long story short: in Django only a related manager with a .clear method can unassign; if this method is not present, then the only option is removal.

So the fixed code looks like this (minus the comments):

            if value == 'obj_a' and manager != related_a:
                if hasattr(manager, 'clear'):
                    manager.clear()
                else:
                    manager.all().delete()
                manager.set(list(related_a.all()))

            elif value == 'obj_b' and manager != related_b:
                if hasattr(manager, 'clear'):
                    manager.clear()
                else:
                    manager.all().delete()
                manager.set(list(related_b.all()))

(Yes, it probably should use a try/except block instead of hasattr; pull requests are welcome!)
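
For the curious, a try/except variant might look roughly like the sketch below. The helper name and the two toy manager classes are made up so that the example runs without Django; this is not the code that went into AMY:

```python
def detach_related(manager):
    """Unassign related objects if the manager can, otherwise delete them."""
    try:
        clear = manager.clear  # only the attribute lookup is guarded
    except AttributeError:
        manager.all().delete()
        return "deleted"
    clear()
    return "cleared"


class ClearableManager:
    """Toy manager that supports unassignment."""

    def __init__(self):
        self.calls = []

    def clear(self):
        self.calls.append("clear")


class DeleteOnlyManager:
    """Toy manager without .clear(); related objects can only be deleted."""

    def __init__(self):
        self.calls = []

    def all(self):
        return self

    def delete(self):
        self.calls.append("delete")


print(detach_related(ClearableManager()), detach_related(DeleteOnlyManager()))
```

Guarding only the attribute lookup means that an AttributeError raised inside clear() itself is not mistaken for a missing method.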

Final words

All in all, I feel good about this bug; if anything, I’d like to eliminate the erroneous timezone arithmetic.

Also, all the backup mechanics and logging worked really well.

As a result of the investigation, the bug and the fix described above, I released AMY v1.5.2 last night.

AMY release v1.5.1

Since my return to university for my MSc, the development of AMY has slowed down. This past month we had a number of submissions from prospective GSOC’16 students (yay!) and, for the first time, the number of bugs fixed exceeded the number of new enhancements.

Since the number of new features is small, I decided to release a minor version (v1.5.1).

Contributions by GSOC students

March 2016 was the GSOC’16 application period for students. We had a lot of students this year and we encouraged them to take a look at AMY and maybe fix something. This resulted in a number of good contributions.

New features

Starting with new features, since there are so few of them:

  • Greg Wilson extended the check_certificates.py command to additionally return events people participated in
  • Shubham Singh added “Notes” field to instructor profile update form
  • Shubham Singh added new tag “hackaton”
  • Greg Wilson removed command check_badges.py
  • I enabled autogeneration of user’s username after they’re added to the database
  • Greg Wilson added link to Privacy Policy in the footer.

Bug fixes

  • Nikhil Verma found and fixed “List duplicates” page error when no duplicates existed
  • Chris Medrela found and fixed 404 page for revision that didn’t exist
  • Greg Wilson fixed NameError in check_certificates.py
  • I fixed a 500 error appearing when user submitted incomplete form used for matching people’s names
  • Maneesha Sane fixed the minimum number of instructors required in our workshop request form
  • I fixed API renderers (CSV, YAML) not iterating generators but displaying their textual representations
  • I fixed instructors-num-taught report to include people’s names
  • I fixed a small typo in the name of post-workshop survey for instructors (it was called “pre”)
  • I made the emails case-insensitive
  • I fixed some 500 errors related to event importing.
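
The renderer bug in the list above (generators shown as text instead of being iterated) is easy to reproduce in plain Python. A minimal illustration with a made-up generator, unrelated to the actual renderer code:

```python
def names():
    """Made-up generator standing in for data passed to a renderer."""
    yield "alice"
    yield "bob"


# Converting the generator object itself gives its textual representation,
# something like "<generator object names at 0x...>".
wrong = str(names())

# Iterating the generator gives the actual values.
right = ", ".join(names())

print(wrong)
print(right)
```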

AMY release v1.5

Development of AMY saw a boost in February thanks to my winter break (I graduated from university and had about a month of free time before my MSc studies started), and it ended with today’s release of v1.5.

AMY’s development chart shows a peak in February 2016.

Experiment

This time I decided to carry out a small experiment: I broke the month-long release cycle by deploying a v1.5.0-dev around February 15th.

There were some big changes already in the develop branch, so I was hoping to get feedback on them and fix any issues in time for March.

Unfortunately I don’t think anyone has used them yet, since I’ve heard nothing about them.

Bug fixes

The number of bug fixes for this release is higher than for v1.4, but it’s still considerably smaller than the number of new features :-)

Here are the fixed bugs:

  • Django Rest Framework erroring URLs were removed (still not sure what caused them to error-out)
  • dashboard and workshop issues now show only active (== not stalled, not marked as complete) events
  • workshop issues was extended by providing a list of workshops without any assigned instructors
  • a rare error when looking someone up was fixed
  • API throttle rates have been increased
  • current and upcoming events on the dashboard are now based off of published events.

New features

The biggest new features for this release:

  • new workshop requests, once accepted, are linked to the resulting events
  • admins now can submit invoices from AMY
  • admins can now receive event submissions (this should work really well for self-organized workshops that already have a workshop page)

Other changes:

  • badge details view allows for filtering
  • development and production software was updated
  • production assets (JavaScript and CSS files) are now compressed and served with unique names
  • persons merging was reworked and is now a lot better
  • it’s possible to find duplicates in the database now
  • base templates were renamed to reduce confusion
  • API returns award date for every person for every badge
  • a CSV renderer was added to the API endpoint for exporting members
  • debrief was renamed and also allows for CSV export
  • Award model gained awarded_by field pointing to the person responsible for awarding a badge
  • person lookup in some places now works for “Name Lastname” pattern too.

Changes contributed by Greg Wilson:

  • CSV export of instructor completion rates
  • CSV export of missing instructor certificates

(Yes, Greg’s responsible for training new instructors.)

Next release

This was the last month of prof. Ethan White’s support of AMY’s development and I’d like to thank him a lot for the almost half-year long generous support. It definitely helped me work on AMY.

Since I’ve just started Master’s programme, and it’s very challenging, I’ll be slowing down the development pace. That’s also the reason why v1.6 doesn’t have a deadline (yet).

In the queue there are some very interesting changes to be made. To name a few:

  • generating certificates for people from AMY
  • checking-in workshop attendees from AMY
  • having a cronjob look over all workshop pages to find out if they were updated

AMY is probably also facing a database migration to PostgreSQL at some point (sooner rather than later).

AMY release v1.4

The intense month of January is almost behind us, so it’s time for a new AMY release.

Three versions have been released since v1.3: v1.3.1, v1.3.2 and finally v1.4.

Graduation

First, let me brag a little: on January 22 I graduated from my university, and I now hold the official Polish professional title of engineer. You can imagine I’m like this all the time:

I'm engineer! All the time.

Release v1.3.1

One bugfix: don’t break the whole timeline widget when there are TODOs without a due date.

Release v1.3.2

New feature: stop using dots (.) for usernames, use underscores (_) instead.

This was an interesting issue: since we rely on some Ruby software on the SWC website, we can’t have dots in filenames (they’re treated as a parameter access operator; for example, banaszkiewicz.piotr is read as the piotr parameter of the banaszkiewicz object). But we have filenames that correspond to usernames in AMY, so it was necessary to drop dots and switch to underscores…
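
The change itself is tiny. A sketch of the idea, with a made-up helper name (AMY's real username generation may well differ):

```python
def normalize_username(username):
    """Replace dots with underscores so the name is safe to use as a filename."""
    return username.replace(".", "_")


print(normalize_username("banaszkiewicz.piotr"))  # banaszkiewicz_piotr
```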

Unfortunately, due to the way our project is laid out on GitHub, some of the features implemented for v1.4 before this one were included in the deployment; I will still list them in the v1.4 section, though.

Release v1.4

The biggest highlights of this month are definitely:

  • first approach to the new API
  • API reports
  • merging events.

There were also some essential features, but not many. In v1.5 there will be a lot more.

Data fixing

We had to programmatically fix/complete some of our records:

  • historical events on production server were assigned an administering organization (that’s the one responsible for taking care of the workshop bureaucracy),
  • new DC instructors were added: anyone with a special note or anyone who taught at DC workshop now has a DC instructor badge.

Bug fixes

Looking at the list of issues for this release, it seems like many bugs were fixed. It’s true, however the bugs themselves weren’t that big:

  • some fields containing numerical values were switched to other type of fields to prevent slider from appearing; the background for this issue was that when scrolling through a page with form, on MacOSX people would accidentally change values of numerical fields,
  • generation of initial revisions was added to the process of creating a fake database for development use,
  • some types of events (stalled and unresponsive) were kicked out from debrief lookup,
  • some invoice options were changed to remain consistent with the rest.

New features

As usual, we hit a fair number of new features for AMY:

  • Person model is now able to store person’s occupation and ORCID code,
  • events can hold links to survey results (pre-workshop for learners and for instructors, post-workshop for learners and for instructors, and long-term for learners),
  • API call for getting members list is now for logged in users only, and returns members’ usernames too,
  • merging events: with option to select fields from either of events, or (in some cases) even to combine them together. The underlying code may be reused to fix persons merging,
  • workshop issues page now allows filtering workshops by assigned admin,
  • most reports were moved to the API; 3 reports now present a graph for easy use, 1 report was requested to be moved to the API, and 1 new report was requested (and I made it in the API),
  • API: new structure. It uses hyperlinks between resources and allows viewing and filtering, for example, people associated with specific events,
  • slow tests were fixed (we gained probably around 10s on whole test suite, even though about 10-20 new tests were introduced); now it’s time to speed up migrations,
  • Greg added two new badges to the database: maintainer and trainer; I made sure to allow for editing badges via Django Admin interface, and also added these new badges to the fake database command,
  • Greg also added a new command for getting list of people who should be warned because their instructor training was about to close,
  • meanwhile I added a command for displaying report about instructor training completion rates.

Next release

I want to thank prof. Ethan White for his support to AMY development in January.

The next release may be the last one made on such a regular basis. The reason is that in March I start a new academic year (Masters!) and I know it will be very hard; what I don’t know is whether I will have as much time to work on AMY as in previous months.

Therefore there are multiple important features we want to implement in the v1.5 release – look for the “essential” issues.

AMY release v1.3

Welcome to 2016!

In the past month we’ve seen two releases of AMY: v1.2.1 and v1.3. This blog post contains joint release notes for both of them.

Bug fixes

  • wrong URL used in event validation or import/update features is now indicated (and we won’t receive wrong notifications about it)
  • properly throw 404 on some pages (previously: 500)
  • spaces are stripped from Person and ProfileUpdateRequest fields (names, emails)
  • disable location inputs on event details page if online country was preselected

New features

  • use custom-built jQuery-UI (so that we no longer have conflicts with Bootstrap’s tooltip module)
  • Greg updated the script used to send instructors “Hey, update your info” mails (it’s getting removed later on)
  • it’s possible to add memberships per host
  • new badge: DC instructor
  • new logic for dealing with two instructor badges
  • timeline of TO-DO items
  • basic models (e.g. lessons, tags, academic levels, etc.) are now manageable from Django’s admin interface
  • all persons view: add filtering by workshop type person taught at
  • remove blurred production database in favor of generated fake database
  • mailing script turned into better Django management command
  • bulk upload now shows generated username and suggested people with matching names
  • show preview of event on SWC website
  • API: filter events by tags

No longer with us

  • Greg removed some unused scripts (test-command-line-upload.sh) and commands (parse-instructor-info.py)
  • notifications about invalid HTTP header Host
  • other removed scripts and commands

Next release

January and February don’t seem busy for me, so I hope to have more done on AMY in the coming months. I also want to thank prof. Ethan White for:

  • continuing his support of my work through December
  • extending his support for the next two months!!!

See what’s scheduled for v1.4: https://github.com/swcarpentry/amy/milestones/v1.4

Automation Control and Robotics

I’m studying Automation Control and Robotics, a major whose name doesn’t say clearly what a person would do after graduating.

From time to time I talk with people who either think I’m studying programming, or that I’m going to build robots in the future.

What is Automation Control and Robotics?

Most people don’t know what automation control is, so they focus on the part that sounds familiar: robotics. They automatically assume I’ll be building robots like these, or these.

Well, I won’t.

My studies concentrate on things like control systems (think of it as SCADA), optimization methods, control theory, electronics, specialized electronics (FPGAs, embedded systems, assembler programming, industrial-class robots), leverage of stochastic methods in industrial process identification, computer vision, PLC programming, and others.

As you can see, I was mostly balancing between engineering and very specialized computer science. There really was very little robotics during my Bachelor’s studies.

What can you do after graduating?

I like to call it: engineering.

People graduating from automation control and robotics are broad-minded and ready to work in pretty much any engineering field that has something to do with programming:

  • we can set up wind turbines
  • or air control systems
  • or nitrogen refill systems
  • or fine-tune power plants
  • or build assembly lines

There are thousands of options, all different kinds of industries.

Do I enjoy these studies?

Unlike many of my friends, I do enjoy studying automation control and robotics. I have learned a lot of engineering- and maths-related subjects, and I have hopes for a great job in the future.

AMY release v1.2

This is the third post-GSoC release of AMY, the workshop management tool for Software Carpentry. This release was supported by prof. Ethan White. Thanks a lot!

Below you can find release notes for version 1.2.

Bug fixes

This release contains bigger bug fixes than 1.1:

  • wrong object type passed to "".format() method
  • wrong characters permitted in event slugs
  • (big one) inconsistent logic in EventQuerySet.(un)published_events methods

New features

The following feature is also a bug fix, but it’s mostly a new feature so I included it here:

  • (big one) fix finding, parsing and validating of event tags: it will now work with <meta> tags on workshop websites

Other new features:

  • after approving person’s profile update request, the updated profile is displayed instead of the list of other update requests.
  • password reset form was added,
  • issues related to, for example, missing location data are now highlighted on event details page,
  • (big one) admins can now be assigned to specific events or event requests,
  • Greg changed descriptions section names on “instructor issues” page,
  • the same page was updated by me so that we can have pending and “gave-up-on” trainees listed,
  • the previous feature was introduced thanks to new ‘stalled’ tag,
  • our API gained some filtering (go to https://amy.software-carpentry.org/api/v1/events/published/, click “OPTIONS” and look at “query_params”; these can be added to the API calls, for example: https://amy.software-carpentry.org/api/v1/events/published/?host=123456789&administrator=987654321)
  • the same API gained a new endpoint used for generating a list of current members of the Software Carpentry Foundation; this is in no way an official list of members, but it can be used to help determine who’s eligible (credits for this one go to both Greg and me since I finished his pull request),
  • it’s now possible to search in events’ URL, contact, venue and address fields,
  • 2 new options for invoice status were added (not invoiced for historical reasons and not invoiced because of membership),
  • more places (workshop issues, and on each workshop without attendance data) to send “Give us attendance figures” emails, more people to send to,
  • profile update requests can now be edited by admins.
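
The query parameters from the API filtering item above can be assembled with Python's standard library. A small sketch: the helper name is made up, and the host and administrator values are the placeholder IDs from the example URL, not real records:

```python
from urllib.parse import urlencode

BASE_URL = "https://amy.software-carpentry.org/api/v1/events/published/"


def published_events_url(**filters):
    """Build a filtered URL for the published events endpoint."""
    if not filters:
        return BASE_URL
    return BASE_URL + "?" + urlencode(filters)


url = published_events_url(host=123456789, administrator=987654321)
print(url)
```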

Next release

There are a number of issues scheduled for the v1.3 release, and others will be added to that list. The problem is that December contains:

  • end of my semester,
  • one huge exam, a couple of smaller tests,
  • Christmas,
  • deadline for my BEng. project and thesis.

So the total time I spend on AMY in December will probably be lower than in November.