The Learning Systems development team at the OU has been planning a bit of a restructure for a while, and finally everything is in place and people are transitioning into new roles.

I have been given the honour of becoming the Development Architect in the new Learning, Media and Collaboration development team.  Don’t google it, it’s one of those odd OU job titles that doesn’t seem to match industry-known titles.  Maybe you can help me find a better name for it!

I’m still working out exactly what this means in practice, but it seems to involve oversight of the architectural designs for our systems and the way they link together, and taking the lead on best practice in development, on how we engage with the open source community, and on delivering quality products.  I guess this blog will need a re-brand!

I have a lot to learn about our media production and collaboration systems.  And I certainly feel that I’m standing on the shoulders of giants in the Moodle development world.  The idea of leading the people in this team is almost nonsensical.  The role is set up as a sort of “first among equals”.  It’s more about generating consensus than preaching from a pulpit.  In some ways this fits with a “servant leader” role, acting as a facilitator for the team so they can continue to do great development.

Although I’ll be doing a whole lot less actual development in future, I hope to keep my hand in so that I don’t become one of those managers who doesn’t understand what their team actually does.  That’s not working so well so far – meetings 1 : coding 0.

The next big change for the development teams will be to make our processes even more agile.  At the moment, each lead developer runs a team of developers (OK, sometimes just themselves) focusing on a particular aspect of our learning systems.  We are now merging teams into fewer agile scrums so that everyone has a chance to work on more of our development strands and so we can flex resource to meet our priorities better.  We’ve already started to merge our “to do” lists into a single prioritised portfolio backlog and estimate in story points, so in a few months we’ll have a better handle on team velocity.

I’m sure I’ll blog more about this later as we work out how this looks for us in practice.

I went to the #Design4Learning conference last week – yes, OK, I’m a bit late with the write-up, it’s been a hectic week or two!

The conference was held at the OU, which had the advantage that I could get home in the evening for my son’s birthday, but the disadvantage that I kept popping back into the office to keep on top of things.  It reminded me that while I hate the coffee, lunch, and conference dinner opportunities for informal contact (being an introvert by nature), there is something lacking if you don’t participate at all in them.

It was great to spend a couple of days thinking about what we’re developing, and why, rather than how.  I really value these opportunities to talk to academics.  The focus of the conference was around learning design and learning analytics.  I won’t write up all the sessions that I went to, but here are some of the highlights for me…

Sharon Slade presented about the OU’s ethics policy for learning analytics and the questions that students raised.  The OU is considered to be the first university with an ethics policy for learning analytics.  There is a real challenge in working out how we communicate to students what we’re doing.  On the one hand students want a personalised experience, but they don’t want us “snooping”.  They’re happier with feedback based on trends rather than personal activity.  What is clear is that the results of analytics should focus us on what questions to raise with students, rather than drawing conclusions about what’s going to happen to them.

You can never really know why a student has a pattern of activity, and whether or not they’re likely to fail as a result.  Elsewhere in the conference (I forget when) I heard it said that students who post often in the forums and turn in their assignments on time are most likely to pass.  Back in the 90s when I was an OU student, I certainly handed in my assignments on time, but I never posted in the forums.  I was too shy, too lacking in confidence that I had anything of value to add, too unwilling to expose my own stupidity…  I wonder what the analytics would have said about me… But I passed with flying colours.  Similarly my son (who has learning difficulties) always fails tests, rarely hands in his homework, never raises his hand in class… but you only have to talk to him to know that he is learning.  Obviously “learning” isn’t good enough for a university, which needs students to pass qualifications to prove success, but we live by different rules in my house!  Learning analytics done badly might suggest that a uni should give up on my son.  Done well, it should suggest that he needs a lot of extra support to turn his innate ability into a pass.

Simon Cross talked about his Open Education MOOC, which formed a block of a formal OU course, where students and the public learned together in the open.  Apart from being pleased to see my old project, OpenLearn, being used for this, the thing that most interested me was that students had concerns over what they are paying for with their OU study (suggesting perhaps that they didn’t value the assessment, tutor support etc.), and that they wanted the badge as well as the TMA grade – and I thought badges were a gimmick!

There was a useful learning design tool from Imperial College London called BLENDT.  You plug in your learning outcomes and it helps you work out what sorts of (online) activities will help meet them.  It is based on Bloom’s taxonomy, where objectives are classified as psychomotor, affective or cognitive skills – users pick words in their objectives such as “explain”, “list”, or “discuss” and the system works out which of the skill sets these map to and then presents example activities that best suit meeting those objectives.  It is customisable based on factors such as group size, to cater for the fact that some activities don’t work in some situations.  This tool aims to provide a discussion point for teachers to make final decisions on the activity mix.  It looked like something that could be very helpful in supporting work to embed learning design in our every-day practice.  I wonder whether we could write a similar Moodle module?  Or maybe someone already did?
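Out of curiosity, here’s a toy sketch in PHP of that verb-matching idea.  It is not BLENDT’s actual logic or data – the skill domains, verb lists and function name are my own illustrative guesses:

    <?php
    // Toy sketch only: map verbs found in a learning outcome to Bloom-style
    // skill domains.  The verb lists are illustrative, not BLENDT's real data.
    $verbmap = array(
        'cognitive'   => array('list', 'explain', 'compare', 'evaluate'),
        'affective'   => array('discuss', 'justify', 'reflect'),
        'psychomotor' => array('demonstrate', 'perform', 'assemble'),
    );

    function classify_outcome($outcome, array $verbmap) {
        $domains = array();
        foreach ($verbmap as $domain => $verbs) {
            foreach ($verbs as $verb) {
                // Naive substring match; a real tool would be cleverer.
                if (stripos($outcome, $verb) !== false) {
                    $domains[] = $domain;   // e.g. "explain" maps to cognitive.
                    break;
                }
            }
        }
        return $domains;   // A real tool would now suggest matching activities.
    }

    // classify_outcome('Explain and discuss the causes of X', $verbmap)
    // returns array('cognitive', 'affective').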

Finally, one fascinating piece of software from Denise Whitelock at the OU’s Institute of Educational Technology was Open Essayist, which lets students upload draft essays to get feedback on structure.  The tool shows you where your key words and phrases are in the essay, helping you ensure you have a good start/middle/end, a spread of keywords across the essay, and clear connections between concepts.  Because it provides feedback on structure only, it should apply across disciplines.  It has been shown to improve grades, although obviously structure is only one component of a grade, so you do have to say something useful and relevant and understand the assessment criteria for your subject to get top marks.  Denise has also found the tool useful for analysing MOOC comment threads and creating paper abstracts.  There was clear demand in the room from OU tutors for us to provide this to all students, especially at level 1.  I would have loved something like this when I first learned how to write essays.

This blog post feels a bit rambling.  I wonder what Open Essayist would make of it!  Anyway, I hope I’ve given you a taste of the conference and some of the ways that analytics may be changing the services we offer to students and the way we design learning in the future.  I hope my team get a chance to be involved in some of these developments.

The lead developers have been experimenting with story points recently. I wrote a few weeks back about how I thought we might decide on the meaning of a point.

We decided that our scale would range from 1 to 377, so that the upper end catches the complex, major projects that we sometimes work on.

Then we decided that we’d start with 1 story point being things we’d traditionally have expected to take a quick, experienced developer about an hour.  Working all the way up the scale, a 377-point story is the sort of thing we’d traditionally have estimated at 6 months or so.

Then we dug around our recent work and our future plans for things that fitted along the scale.  We put them all in a big pot and chose two or three for each level on the scale as the most useful exemplars for sizing work in future.  We also came up with a few useful rules for when you might want to move up a level because a factor makes the work more complicated:

  • adding a lot of Behat and/or unit testing;
  • lots of JavaScript and/or CSS;
  • working in an area that is already very complex; and
  • working with a third-party community or integrating with an external system.

We’ve recently been given a single prioritised backlog from our Learning & Teaching colleagues, so we decided to try out our shiny new story points on their list.  This morning we played planning poker for the very first time.

We set up a spreadsheet with the business description, a supplementary techie-speak description, and a column for each person.  Each person then had a couple of days to use the story point exemplars and enter their estimates.  Each person hid their column on completion so that no-one else’s estimate was biased.

To be fair, I found I mostly thought in days still and worked back to story points.  It’s going to take some time to be able to look at a description and think in points instead.

Today we worked out the median of everyone’s scores, picked the nearest point on the scale and decided what we wanted to announce as our final estimate.  I was pleasantly surprised by the amount of consensus we achieved.  There were some items with wildly differing opinions, but that usually just highlighted that the request was still poorly defined and we had differing assumptions about what we would do.

We decided upon a few more “rules” (there’s a sketch of the whole procedure after the list):

  • If the range is very diverse because the item is ill-defined, we refuse to estimate until we have more detail.
  • If the median is the same as the estimate from the person who is the subject matter expert for the area of work, we go with that as our final estimate.
  • If the median is one step either side of the estimate from the subject matter expert, that person has final say to overrule or accept the median.
  • If the median is more than one step either side of the estimate from the subject matter expert, further discussion is required to clarify and agree a final estimate.  There were actually very few stories which fell into this category.
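For the mechanical part of that procedure, here’s a minimal sketch in PHP of how a single story’s figure emerges.  It is not our actual spreadsheet, and the function names and numbers are made up for illustration:

    <?php
    // Take the median of everyone's blind estimates, then snap it to the
    // nearest point on our scale.  The subject-matter-expert rules above then
    // decide whether the snapped median stands, the SME overrules, or we talk.
    $scale = array(1, 2, 3, 5, 8, 13, 21, 34, 55, 89, 144, 233, 377);

    function median(array $values) {
        sort($values);
        $n = count($values);
        $mid = (int) floor($n / 2);
        return ($n % 2) ? $values[$mid] : ($values[$mid - 1] + $values[$mid]) / 2;
    }

    function snap_to_scale($value, array $scale) {
        $best = $scale[0];
        foreach ($scale as $point) {
            if (abs($point - $value) < abs($best - $value)) {
                $best = $point;   // Keep the scale point closest to the raw median.
            }
        }
        return $best;
    }

    $estimates = array(8, 13, 13, 21, 13, 8);        // Six blind estimates.
    echo snap_to_scale(median($estimates), $scale);  // Prints 13.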

Using this approach we managed to get 6 people to estimate just under 100 stories in just under an hour.  That’s a lot quicker than any of us expected.  We ended up with an estimate of just over 2,500 story points for 3 months work by 8.4 people.  Gut instinct suggests that we might manage about 75% of that.  But since there were about a dozen things we couldn’t estimate, maybe that’ll be more like half in reality?

So with my project manager hat on, I now have to work out whether we should have estimated for the “top slice” tasks: support, bug fixing, advice, technical debt stories…  We probably should have, but I’m not sure if the same approach works. Especially for “keep the wheels on” server monitoring activity.

And now that I have a set of estimates for most of the things our business partners want, we have to communicate that back to them.  There will, presumably, be some iteration while we refine the requirements for the things we refused to estimate, attach names to tasks that no-one thought they were doing but which had high priorities, and drop some stuff off the bottom of the list.

I’m really interested to see what we actually deliver at the end of January, so we can, for the first time, get a feel for what our development team’s velocity might be.  Remind me to tell you about that when the time comes!

Well, not prizes in this case!  No, I’m referring to story points.  One of the next things on my to-do list is to look into exemplar stories with points, to give us a baseline to start from as we move to estimating development in points rather than hours.

In some ways this seems pretty easy:

1) define a 1 point story
2) define your scale
3) think of some things that fit at the higher story points.

One is OK. The smallest piece of Moodle work that I can think of is to update a language string, or change a style on a discrete screen element. These are the sort of thing that take a few minutes to do, plus a bit of testing. Maybe an hour in total. So that’s what I’m starting from as a single point.
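For concreteness, the whole of a one-point change like that can be a single line in a language file (the block and string names here are invented, not a real OU plugin):

    // lang/en/block_ourblock.php -- hypothetical block plugin.
    $string['welcome'] = 'Welcome to your study planner';   // Was just 'Welcome'.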

Two is OK. We’ll use the common Fibonacci approach, so our scale will be 1, 2, 3, 5, 8, 13, 21, 34. And that’s probably far enough. If one point starts off looking a bit like 1 hour, then 34 points is a little over a week’s work. And the theory says that if you’re estimating something as more than a week’s work, then it’s an epic, not a story. We might quote for epics in bigger numbers – let’s say 55, 89 and 144. At 144 story points, we’re talking about 4-5 weeks’ work and our ability to estimate accurately is probably quite poor. Then there would be 233, 377 and 610. Now we’re in the region of 5 months’ work, which we would probably describe as a theme, with even less accuracy. And there’s not a lot of point in counting points after this, so the next step is infinity or “nobody knows”.

But for my current focus, I want to categorise up to 34. And it is at step 3 that it starts to get a bit more challenging. Here are the kinds of things that I’ve been thinking about:

* add a configuration setting to control an existing feature (there’s a sketch of this one after the list)
* write a block to display a message
* add a capability to control permission over an existing feature
* add a field to a form and save the result in the database
* add a log event on a specific page
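As a flavour of the first of those, here’s roughly what “add a configuration setting” looks like in a plugin’s settings.php.  The plugin and setting names are invented for illustration:

    <?php
    // settings.php of a hypothetical local plugin, local_ourwidget.
    defined('MOODLE_INTERNAL') || die();

    if ($hassiteconfig) {
        $settings = new admin_settingpage('local_ourwidget',
            get_string('pluginname', 'local_ourwidget'));
        $ADMIN->add('localplugins', $settings);

        // A simple on/off switch controlling an existing feature.
        $settings->add(new admin_setting_configcheckbox(
            'local_ourwidget/enablefancyview',                       // Config name.
            get_string('enablefancyview', 'local_ourwidget'),        // Label.
            get_string('enablefancyview_desc', 'local_ourwidget'),   // Help text.
            1                                                        // Default: on.
        ));
    }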

I seem to be able to think of lots of really little things, and lots of really big things, but I’m rather struggling with the things in the middle.

So, do you use story points to estimate Moodle development work? If so, are you willing to share what yours look like?

Two weeks ago I posted that the OU Moodle development team were going to spend a fortnight writing Behat scripts.

I promised to write up how we got on, so this post fulfils that promise, I hope.

Progress

Our aims for the fortnight were

  1. to generate automated Behat tests for a significant proportion of OU plugins that we can use in future;
  2. to create the know-how within the team of how to create and run those tests; and
  3. to establish writing tests like this as part of what we do in any development.

Of the 15 developers, 4 had written Behat scripts before and acted as mentors for the team.  By the end of the fortnight another 8 developers had got Behat set up and 7 had successfully written scripts.

The OU has over 200 plug-ins in our Moodle code base.  Before the sprint only a couple had any Behat coverage.  Now about 16% do.

We wrote over 3,000 lines of Behat-related code in the past fortnight, about 10% of which are for custom steps.

Our handover checklist now says that all new features should come with Behat scripts, unless there’s a really good reason not to.

People generally found the fortnight useful, both because learning the same thing together meant people were able to help each other out, and because Behat itself is beneficial to our work.  Some found bugs and fixed them as they went along, so there has been a (small) positive impact on system quality.  All agreed that it is enjoyable to see the scripts working and that this sort of automation is what computers are for.  Hopefully we’ll use this approach when we need to learn something new again.

Obviously, that’s not the end of the story.  We had a retrospective and came up with some tips & tricks to share with each other, and a few questions.  And we’ll carry on doing Behat and getting better coverage for more of our plug-ins.

Tips and tricks:

  • The Site administration tree sometimes takes too long to expand, so it times out before the next script step, which then fails.  Either restart the PC or, if that fails, write an “I expand” step followed by an “I navigate to” step (see the sketch after this list).
  • Sometimes when filling in forms, care is required over the enabling of other fields or buttons before moving to the next step.  You may need to alter the order of the steps, or inject extra steps such as checking that the button exists.
  • Cron only runs once per minute, so if your script runs cron repeatedly with the “I trigger cron” step, you may need a step to reset cron so that it will run again.  This bug has been reported to HQ.  In any case, if you have a cron task which doesn’t fire on every cron run, you will need to set this up specifically in your script so that your conditions are met.
  • It is possible to use the PHPUnit generators as a way of doing some background setup, e.g. if you want to set up a number of activities with specific properties.  It is quicker to set up the course in this way than to write Behat steps that set it up through the UI.
  • Lots of our plug-ins are interdependent, with “if x is installed” checks in the PHP so that we can still share them individually.  If you want to test integration between plugins you can write a custom step that checks whether the other plug-in is installed and throws a SkipException if it isn’t (there’s a sketch of this after the list too).
  • Use names visible in the UI to identify screen elements rather than CSS ids.  This is (a) more readable by humans, and (b) if your feature is built this way it is likely to be more accessible to screen readers.
  • If you have to use a complex xpath to identify a screen element, add a comment to explain to the human reader which bit of the page you’re trying to test.
  • If you are including test files, it is OK to put them in the fixtures folder, but because they’re in the webroot they must only contain publicly available materials.
  • If you are repeatedly writing the same collection of steps, you should convert these to a custom step.
  • Don’t forget to update the steps list occasionally to see what’s available.  Take care when relying on a custom step stored in another plugin – can you be sure the plugin will be there (e.g. for contributed plugins)?  You may need to copy the custom step to your plug-in.
  • When writing custom steps, consider whether they are generic to Moodle; if so, they should be offered to core (e.g. a date picker step).
  • Write scripts where you set up a feature as one user and then switch to log in as an account with lower permissions to test the actual output.
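To illustrate the first tip, here’s roughly what the workaround looks like in a feature file.  The step wording is from memory and varies between Moodle versions, so treat it as a sketch rather than copy-and-paste:

    # Expand the admin tree explicitly before navigating, so the navigation
    # step doesn't time out while the tree is still loading.
    Given I log in as "admin"
    And I expand "Site administration" node
    And I expand "Plugins" node
    When I navigate to "Manage activities" node in "Site administration > Plugins > Activity modules"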
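And for the interdependent plug-ins tip, a minimal sketch of a custom step that skips a scenario when another plugin is missing.  The class, step wording and plugin names are invented, and the exact “skip” exception class depends on your Behat/Moodle version:

    <?php
    // tests/behat/behat_mod_ourplugin.php (hypothetical plugin).
    require_once(__DIR__ . '/../../../../lib/behat/behat_base.php');

    class behat_mod_ourplugin extends behat_base {

        /**
         * @Given /^the "(?P<component_string>[^"]*)" plugin is installed$/
         */
        public function the_plugin_is_installed($component) {
            // Ask Moodle whether the component exists in this codebase.
            if (core_component::get_component_directory($component) === null) {
                // Skip, rather than fail, so the feature still passes on sites
                // that don't ship the optional plugin.  (The exception class
                // name varies with the Behat version in use.)
                throw new \Moodle\BehatExtension\Exception\SkippedException(
                    "Skipping scenario: {$component} is not installed.");
            }
        }
    }

A scenario testing the integration can then begin with a step like: Given the "mod_otherplugin" plugin is installed.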

Area of concern:

The latest Firefox upgrade (to version 32) causes Behat to fail.  In summary, don’t upgrade, or use Chrome for a while.  This has also been reported today in the GDF forum.

There was some discussion over the difference between test scripts and Behat scripts – they should not be identical.  Test scripts should cover more or different things than the Behat scripts, or the bits that cannot easily be scripted in Behat.

Perhaps you have a view on some of these, or some better tips and tricks?  If so, please share them in the comments.

Writing Behat scripts!

Here in the OU’s learning systems development team we’ve been watching others work with Moodle and Behat for some time.  I first saw it demo’d at the Perth Hackfest about 18 months ago.  We’ve experimented with automated testing using Selenium and Ranorex in the past, but with little success, so we’ve been a little cautious about jumping on the Behat bandwagon.

But the stars have aligned to change that.  We have recently updated our codebase to Moodle 2.7.x, which heralds the usual spree of regression testing to make sure that nothing broke.  That, along with the fact that Moodle HQ now look for Behat scripts as part of integration testing, prompted Sam Marshall to suggest that we should use this opportunity to write Behat tests for our plug-ins.

It seems like a good idea to write the scripts to perform the regression testing that we’d otherwise be doing by hand.  Hopefully that will make the process quicker for 2.8.x!  So we decided to dedicate the entire team to a fortnight’s Behat sprint.  Everyone will be planning, writing and/or running automated tests at the same time, so we can learn from each other as we go along.
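To give a flavour of what these scripts look like, here’s a minimal sketch of the kind of feature file we’ll be writing.  The tags, course details and user are made up, and the steps are the standard Moodle ones as I remember them, so treat it as illustrative only:

    @ou @ou_smoketest
    Feature: Student can reach their course
      In order to be confident the upgrade broke nothing
      As a student
      I need to be able to log in and open my course

      Scenario: Student opens the course
        Given the following "courses" exist:
          | fullname | shortname |
          | Course 1 | C1        |
        And the following "users" exist:
          | username | firstname | lastname | email          |
          | student1 | Sam       | Student  | s1@example.com |
        And the following "course enrolments" exist:
          | user     | course | role    |
          | student1 | C1     | student |
        When I log in as "student1"
        And I follow "Course 1"
        Then I should see "Topic 1"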

Our aims for the fortnight are

  1. to generate automated Behat tests for a significant proportion of OU plugins that we can use in future;
  2. to create the know-how within the team of how to create and run those tests; and
  3. to establish writing tests like this as part of what we do in any development.

How will we do?  Check back in a fortnight to find out!

As I mentioned in a previous post, I see software as a creative process, an art form.  So this is a post about the other creative art form that I enjoy, bobbin lace making … like this.

A small part of a Bedfordshire lace edging, pattern by Christine Springett from an 18th C collar, completed 2014

One of the reasons my brain starts to fry when I’m asked about suitable metrics to measure the performance of a development team is that I’m struggling to answer the question “how do you measure art?”  So, how do I measure my lace?

Well sometimes with a tape measure, sure … wedding garters tend to need to be about 40 inches to get the ruffles right.

But there are other ways too.  You could count the stitches, but that would be a bit pointless.  We often count the number of bobbins used, and whether the threads “run” or have to be sewn together in separate sections; these are signs of the complexity of the piece.  And yes, I can count how long it takes.  A garter normally takes about 1.5 hours per inch; the piece above was worked slowly over about 2 years.  Of course, I often have an idea of when I want something finished by – I may give my lace away, so the deadline is Christmas or a birthday – so hitting the target date (actual vs estimate) is sometimes meaningful.  And sometimes not – I did several other pieces in the 2 years I spent on the piece above, because they had a higher priority.  Does that mean the edging is “bad”?  No.

You can also count the number of times I rework parts of a pattern to get it right.  Or the number of defects left in the finished article.  And you can turn it over to look at the back, and see how neatly it is finished (like doing a code review).

The things that are most important to me are more subjective … Does it have “wow” factor?  Does it do what I (or the person receiving it) want it to do?  What do my lace-maker friends think of it? These fall more into the realm of user & peer feedback.

And finally, did I learn something?  My first attempt at Bedfordshire lace was not very good, but this piece is the culmination of 20+ years of practice.  I got better (neater, quicker, more complex, more wow) along the way.  Some learning pieces come off the pillow and languish in a cupboard because we are not proud of them.  But they were still worth it, because we learned.

When I start a piece, I know what’s going to be important.  Right now, I’m doing a nativity set for Christmas, a bit like this.  So, this is a simple lace that I’ve done many times before.  Nothing to learn.  Neatness and wow factor are important because they’ll be on display every year, as is durability.  Time is important if I want them on display this year.  The next piece on my backlog though is a first go at Flanders.  For that, time will be meaningless, neatness and wow are unlikely unless existing skills transfer readily, but learning is critical.

Does this teach us something about software metrics?  We can judge the complexity of a request, and we can do code & peer review and count defects after release.  These are all worthwhile in my view.  But I’d like to see us judge what is important at the start of the piece, and measure against that at the end.

So “add the permission control to roll out reports to faculty” is something we’ve done many times before: nothing to learn, neatness and durability important, time critical.  But “make annotation work on PDFs and images” will be an innovation piece – a work of art.  It has no time deadline, will definitely provide wow, and requires significant learning.  Does it really make sense to estimate how long creation takes and then judge ourselves against it?  I think not.
