This year at Devoxx Belgium, I attended talks on data anonymization in the public sector, reframing testing in software engineering, enhancing AI support for programmers, and improving developers’ productivity and experience. Here are the key takeaways from those talks.
Data anonymization
From the talk “Privacy in Practice with Smart Pseudonymization: Lessons from the Belgian Public Sector” by Kristof Verslype
- Using non-anonymized (raw) data in test environments is a bad practice that has already caused leaks. According to the World Quality Report 2020, 60% of organisations were doing it.
- GDPR requirements are often overlooked when it comes to data used in test environments. This is especially tricky because test environments are typically less secure than production.
- Fictional data for testing (e.g., mocks) is not ideal because it can miss some corner cases. Fake data also requires more work to maintain.
A solution for smart anonymization: format-preserving pseudonymization
- Pseudonymization: The structure of the original data is preserved. For example, a letter is replaced by a letter, a digit by a digit. If we have formatted data, like a social security number, we keep exactly the same format but replace the digits by other ones.
- Shuffle: For text columns like names, the values are simply shuffled, so tests still run on real values but a value can no longer be associated with other sensitive values.
- Domain-specific data that is not meaningful outside of the company is neither pseudonymized nor shuffled.
The entity managing the transformed data is separated from the system managing the pseudonymization.
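To make the three rules above concrete, here is a minimal Python sketch. It is not the implementation presented in the talk: the function names, the key, and the sample rows are hypothetical, and the HMAC-based substitution is only an illustration. It is one-way and is not a vetted format-preserving encryption scheme.

```python
import hashlib
import hmac
import random
import string


def pseudonymize(value: str, secret_key: bytes) -> str:
    """Replace each ASCII digit by a digit and each ASCII letter by a letter.

    Every other character (dots, dashes, spaces) is kept, so the format
    survives. The same value and key always give the same pseudonym.
    """
    # Keyed hash of the whole value: a deterministic source of bytes.
    digest = hmac.new(secret_key, value.encode("utf-8"), hashlib.sha256).digest()
    out = []
    for position, char in enumerate(value):
        byte = digest[position % len(digest)]
        if char in string.digits:
            out.append(string.digits[byte % 10])
        elif char in string.ascii_lowercase:
            out.append(string.ascii_lowercase[byte % 26])
        elif char in string.ascii_uppercase:
            out.append(string.ascii_uppercase[byte % 26])
        else:
            out.append(char)  # separators and non-ASCII characters stay as-is
    return "".join(out)


def shuffle_column(values: list[str], seed: int) -> list[str]:
    """Return the same values in a randomly permuted order (seeded)."""
    shuffled = list(values)
    random.Random(seed).shuffle(shuffled)
    return shuffled


# Hypothetical sample rows: (identifier, first name, internal product code)
rows = [
    ("85.07.30-033.28", "Alice", "PLAN-A"),
    ("90.01.15-124.77", "Bram", "PLAN-B"),
    ("72.11.02-456.10", "Chloé", "PLAN-A"),
]
key = b"demo-key-kept-by-a-separate-service"  # hypothetical key

identifiers = [pseudonymize(row[0], key) for row in rows]  # format preserved
names = shuffle_column([row[1] for row in rows], seed=42)  # real values, new order
codes = [row[2] for row in rows]  # domain-specific data is left untouched
test_rows = list(zip(identifiers, names, codes))
```

In this sketch the key lives in the code only for brevity; in line with the separation described above, it would be held by the system doing the pseudonymization, not by the one storing the transformed data.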
Other anonymization use cases
The talk describes two other cases:
- Blind pseudonymization, which ensures each party only sees the information relevant to them (e.g., a doctor and a pharmacist don’t need the same information).
- Oblivious join: A strategy that is still in development and would be useful for research. When gathering data from different sources, the data format and the filtering capabilities differ from one source to another. The idea is to create an interface that is aware of the data contract and applies the necessary changes (filtering, anonymization) to data coming from different sources before making it available to the research environment.
Software Engineering testing: moving away from Unit Tests focus?
From the talk “Testing done right: From bugs to brilliance” by Wouter Bauweraerts
Prerequisites for testing done right
- Clear requirements from the business.
- Acceptance criteria:
  - written from the end-user perspective
  - clear, domain-specific language, understood by both business and technical people
  - limited to ticket scope (no superfluous info)
  - describing the outcome (not the step-by-step implementation)
  - testable
Opinion: integration and end-to-end tests are more important than unit tests
- We still need Unit tests: they improve trust in the code, they are cheaper and faster to write than other tests. However, Unit tests only test the implementation and not the behavior.
- The “Testing trophy” diagram redefines the importance of each kind of test, in comparison with the classic “Testing pyramid” where Unit tests are predominant.
- When developing microservices, each service has its own testing trophy.
- Don’t mock too much: refactoring the code makes mock-heavy tests fail.
- Don’t generate tests based on code, especially with generative AI tools: if there’s a bug in the implemented code, the tool will treat the bug as normal behavior and write a test that adapts to it. If the code is fixed, the test could break!
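The point about mocks can be illustrated with a small Python sketch. `PriceCalculator` and `FlatTaxService` are hypothetical names invented for this example, and amounts are in cents to avoid rounding issues.

```python
from unittest.mock import Mock


class FlatTaxService:
    """Hypothetical collaborator: a flat tax rate, in percent."""

    def __init__(self, rate_percent: int):
        self.rate_percent = rate_percent

    def tax_for(self, net_cents: int) -> int:
        return net_cents * self.rate_percent // 100


class PriceCalculator:
    def __init__(self, tax_service):
        self.tax_service = tax_service

    def total(self, net_cents: int) -> int:
        return net_cents + self.tax_service.tax_for(net_cents)


def test_total_with_mock():
    # Coupled to the implementation: it checks *how* the result is computed.
    # If total() were refactored to read rate_percent directly, this test
    # would fail even though the returned total stays the same.
    tax_service = Mock()
    tax_service.tax_for.return_value = 2100
    assert PriceCalculator(tax_service).total(10000) == 12100
    tax_service.tax_for.assert_called_once_with(10000)


def test_total_behavior():
    # Checks only the observable outcome, so it survives such a refactoring.
    calculator = PriceCalculator(FlatTaxService(rate_percent=21))
    assert calculator.total(10000) == 12100
```

Both tests pass today; only the second one keeps passing when the internals change without changing the result.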
How to do test-driven development (TDD)
- Write the test (based on acceptance criteria)
- Write the minimal code implementation to have the test pass
- Enrich the code iteratively and test after each change
This also helps break down big functionalities into small steps and small functions.
TDD requires a lot of discipline, and it takes time to get used to it. Beware that TDD could also give a false sense of security.
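As a minimal sketch of that loop in Python, assume a hypothetical acceptance criterion: "orders of 100 euros or more ship for free, other orders pay a flat 5 euros". The function name and the amounts are invented for the example.

```python
import unittest


# Step 1: the tests come first, derived from the acceptance criterion.
class ShippingCostTest(unittest.TestCase):
    def test_order_below_threshold_pays_flat_fee(self):
        self.assertEqual(shipping_cost(order_total=99), 5)

    def test_order_at_threshold_ships_free(self):
        self.assertEqual(shipping_cost(order_total=100), 0)


# Step 2: the minimal implementation that makes both tests pass.
def shipping_cost(order_total: int) -> int:
    return 0 if order_total >= 100 else 5


# Step 3: each new rule (for example a different fee per country) would start
# with a new failing test, followed by the smallest change that makes it pass.
```

Saved in a file such as `test_shipping.py` (a hypothetical name), this can be run with `python -m unittest test_shipping`.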
Integration tests
Integration tests are more centered on behavior.
Behavior Driven Development (BDD):
- Based on acceptance criteria that are translated to behavior tests
- Human readable format (Gherkin syntax) is encouraged and achieved with frameworks like Cucumber
- BDD works in combination with TDD: they are not mutually exclusive, and neither replaces the other
- Integration tests are feature-scoped
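Frameworks like Cucumber bind Gherkin steps to code. As a framework-free sketch in Python, the same Given/When/Then structure can be mirrored in a plain test. The scenario, the `Cart` class, and its methods are hypothetical, and a real integration test would exercise the actual feature rather than an in-memory stand-in.

```python
class Cart:
    """Hypothetical in-memory shopping cart used only for this example."""

    def __init__(self):
        self.items: dict[str, int] = {}

    def add(self, product: str, quantity: int = 1) -> None:
        self.items[product] = self.items.get(product, 0) + quantity

    def remove(self, product: str) -> None:
        self.items.pop(product, None)

    def count(self) -> int:
        return sum(self.items.values())


def test_removing_the_only_product_empties_the_cart():
    # Scenario: Removing the only product empties the cart
    #   Given a cart containing 2 notebooks
    cart = Cart()
    cart.add("notebook", quantity=2)
    #   When the customer removes the notebooks
    cart.remove("notebook")
    #   Then the cart is empty
    assert cart.count() == 0
```

The comments read like the acceptance criterion, which is what keeps the test understandable for both business and technical people.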
Enriching AI assistants with human knowledge
From the talk “AI and Code Quality: Building a Synergy with Human Intelligence” by Arthur Magne
- AI amplifies the information it finds, whether it’s good or bad (principle of garbage in, garbage out).
- Studies showed that performance with AI assistants barely improved, while team performance decreased and code churn increased.
- We can get the best out of AI assistants with human guidance.
- Packmind is a solution that lets a team collect the good practices it has validated and use them as a source for AI assistants.
Developer’s productivity = Developer’s joy
From the talk “Productivity is Messing Around and Having Fun” by Trisha Gee and Holly Cummins
- Making developers more productive is making them more joyful. Happiness + Productivity = double-win, it’s not a tradeoff.
- Automate the boring and repetitive stuff, leave the fun part to the human.
- Developers like to write code, but not code that has already been written many times.
- Tests can become a burden when they take too long:
  - In early stages, run only the tests that regularly fail.
  - Use parallelization to distribute tests.
  - Beware of flaky tests: keep track of tests that have already failed a lot in the past and flag them as flaky. It’s a clue that the failure of the test might not be linked to the latest or ongoing development.
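The bookkeeping behind those tips can be sketched in a few lines of Python. The data layout, the function names, and the threshold of three failures are assumptions made for the example, not something prescribed in the talk.

```python
from collections import Counter


def count_failures(run_history: list[dict[str, bool]]) -> Counter:
    """Count failures per test; each run maps a test name to True if it passed."""
    failures: Counter = Counter()
    for run in run_history:
        for test_name, passed in run.items():
            if not passed:
                failures[test_name] += 1
    return failures


def flaky_tests(run_history: list[dict[str, bool]], min_failures: int = 3) -> set[str]:
    """Tests that failed at least `min_failures` times in past runs."""
    counts = count_failures(run_history)
    return {name for name, n in counts.items() if n >= min_failures}


def early_stage_tests(run_history: list[dict[str, bool]], limit: int) -> list[str]:
    """The `limit` tests with the most past failures, to run first."""
    return [name for name, _ in count_failures(run_history).most_common(limit)]


# Hypothetical history of four pipeline runs
history = [
    {"test_login": True, "test_export": False, "test_search": True},
    {"test_login": True, "test_export": False, "test_search": False},
    {"test_login": True, "test_export": True, "test_search": True},
    {"test_login": True, "test_export": False, "test_search": True},
]

print(flaky_tests(history))           # {'test_export'}
print(early_stage_tests(history, 2))  # ['test_export', 'test_search']
```

A high failure count is only a hint: a test can also fail repeatedly because of a real, unfixed bug, so a flagged test still deserves a look.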
Measuring developers’ productivity
- Measuring developers’ productivity by the number of lines of code produced is not relevant: the same logic can be written in more or fewer lines of code. This metric could push some developers to write bad code for the sake of meeting numbers. Productivity is about writing more features.
- The person-month is not a metric that works well in real life (see The Mythical Man-Month).
- The SPACE framework seems like a better approach to measuring developers’ productivity. It relies on Satisfaction, Performance, Activity, Communication and Collaboration, and Efficiency and Flow.
- Developing a feature is not just typing code but also thinking. Developers need space and time to think; they can’t be writing code non-stop.
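A tiny Python illustration of why line counts say little: the two hypothetical functions below behave identically, yet one is several times longer than the other.

```python
def sum_of_even_squares_verbose(numbers: list[int]) -> int:
    total = 0
    for number in numbers:
        if number % 2 == 0:
            square = number * number
            total = total + square
    return total


def sum_of_even_squares_compact(numbers: list[int]) -> int:
    return sum(n * n for n in numbers if n % 2 == 0)


sample = [1, 2, 3, 4]
# Same result (2*2 + 4*4 = 20) from seven lines or from two.
assert sum_of_even_squares_verbose(sample) == sum_of_even_squares_compact(sample) == 20
```

Neither version is more "productive" than the other; which one is preferable depends on readability for the team.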
Keeping a creative mind as a developer
- Developers need idleness to let the default mode network do its job. Embrace the dead time!
- When working from home, breaks like unloading the dishwasher can unlock the default mode. Others achieve this by doing sports like jogging.
- Fidgeting, knitting, drawing… help some people focus. If you need to keep your hands busy during meetings, explain it to your manager and colleagues. For their part, they shouldn’t judge these behaviors as unprofessional.
- We work in a creative industry, and playing helps creativity.
- Make time for boredom and make time for play.
Bonus: Using IntelliJ as a game engine
Speaking of having fun, I really enjoyed the talk “Let’s use IntelliJ as a game engine, just because we can” by Alexander Chatzizacharias. I didn’t take any notes (indulging in some idleness), and anyway it’s far more fun to watch. Enjoy!