Practical Session #1: Introduction
-
Find in news sources a general public article reporting the discovery of a software bug. Describe the bug. If possible, say whether the bug is local or global and describe the failure that manifested its presence. Explain the repercussions of the bug for clients/consumers and the company or entity behind the faulty program. Speculate whether, in your opinion, testing the right scenario would have helped to discover the fault.
-
Apache Commons projects are known for the quality of their code and development practices. They use dedicated issue tracking systems to discuss and follow the evolution of bugs and new features. The following link https://issues.apache.org/jira/projects/COLLECTIONS/issues/COLLECTIONS-794?filter=doneissues points to the issues considered as solved for the Apache Commons Collections project. Among those issues find one that corresponds to a bug that has been solved. Classify the bug as local or global. Explain the bug and the solution. Did the contributors of the project add new tests to ensure that the bug is detected if it reappears in the future?
-
Netflix is famous, among other things we love, for the popularization of Chaos Engineering, a fault-tolerance verification technique. The company has implemented protocols to test their entire system in production by simulating faults such as a server shutdown. During these experiments they evaluate the system's capabilities of delivering content under different conditions. The technique was described in a paper published in 2016. Read the paper and briefly explain what are the concrete experiments they perform, what are the requirements for these experiments, what are the variables they observe and what are the main results they obtained. Is Netflix the only company performing these experiments? Speculate how these experiments could be carried out in other organizations in terms of the kind of experiment that could be performed and the system variables to observe during the experiments.
-
WebAssembly has become the fourth official language supported by web browsers. The language was born from a joint effort of the major players in the Web. Its creators presented their design decisions and the formal specification in a scientific paper published in 2018. The goal of the language is to be a low level, safe and portable compilation target for the Web and other embedding environments. The authors say that it is the first industrial strength language designed with formal semantics from the start. This evidences the feasibility of constructive approaches in this area. Read the paper and explain what are the main advantages of having a formal specification for WebAssembly. In your opinion, does this mean that WebAssembly implementations should not be tested?
-
Shortly after the appearance of WebAssembly, another paper proposed a mechanized specification of the language using Isabelle. The paper can be consulted here: https://www.cl.cam.ac.uk/~caw77/papers/mechanising-and-verifying-the-webassembly-specification.pdf. This mechanized specification complements the first formalization attempt from the previous paper. According to the author of this second paper, what are the main advantages of the mechanized specification? Did it help improve the original formal specification of the language? What other artifacts were derived from this mechanized specification? How did the author verify the specification? Does this new specification remove the need for testing?
Answers
- Software bug: I found a Chromium issue that still exists today. It appears to have been fixed in some of their applications, but the effect is still visible in Google Maps, YouTube, etc. Here is the link to the issue: https://issues.chromium.org/issues/391788835
The merge of the fix for this bug can be seen here: https://chromium-review.googlesource.com/c/chromium/src/+/6227546/3/components/lookalikes/core/lookalike_url_util.cc#759
The bug concerns font ligatures: when certain character sequences are typed, the font automatically merges them into a single glyph. Ligatures normally let font designers special-case specific combinations of letters, but they can be exploited for other purposes. For example, even a monospaced font can render "<=" as the single glyph "≤". This is one reason ligatures are usually avoided in IDEs, terminals, and similar tools, where such silent substitutions could be hazardous.
These are the reasons why this bug is a global one:
- The bug spans the interaction of several components: the browser or application, the font renderer, the ligature system, and the way domain names are displayed.
- Google's code was correct in isolation, but it made wrong assumptions about the behavior of the font and ligature system.
- The bug only manifests when the domain contains the special character sequences.
Repercussions for clients/consumers: The merged patch does not fix the font itself; it only adds a rule to flag strings containing a lookalike substring, and it does not prevent malicious code from swapping a font for a modified version with different ligature behavior. This could enable a novel attack in which fonts on a victim's device are replaced so that a page mimics, for example, the Google look while hiding the true address of the website, enabling phishing attacks, credential theft, and other abuse.
It could also make clients/consumers lose trust in the browser and misread domain names.
Repercussions for Google: reputation damage, exploitation by attackers, and liability concerns.
Would testing the right scenario have caught the bug? Yes, security testing focused on visual spoofing could have caught it, but this kind of bug is quite rare and similar tricks were mostly exploited around 2006, so it is understandable that it was hard to detect.
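As an illustration of what such a visual-spoofing test could look for, here is a hypothetical sketch (not Chromium's actual lookalike check; the ligature table and class name are assumptions for illustration) that flags hostnames containing character sequences which some fonts merge into a single lookalike glyph:

```java
import java.util.Map;

public class LigatureSpoofCheck {
    // Assumed example mappings; the real set of ligatures depends on the font.
    private static final Map<String, String> LIGATURE_LOOKALIKES = Map.of(
            "<=", "\u2264",   // "<=" rendered as ≤
            ">=", "\u2265",   // ">=" rendered as ≥
            "fi", "\uFB01"    // "fi" rendered as the ﬁ ligature
    );

    // Flag hostnames containing any sequence a ligature could visually rewrite.
    static boolean looksSpoofable(String hostname) {
        return LIGATURE_LOOKALIKES.keySet().stream().anyMatch(hostname::contains);
    }

    public static void main(String[] args) {
        System.out.println(looksSpoofable("office.example.com")); // true: "fi" in "office"
        System.out.println(looksSpoofable("mail.example.com"));   // false
    }
}
```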
-
Apache bug: COLLECTIONS-799, "UnmodifiableNavigableSet can be modified by pollFirst() and pollLast()": https://issues.apache.org/jira/projects/COLLECTIONS/issues/COLLECTIONS-799?filter=doneissues
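The issue title already describes the bug: the unmodifiable decorator still let pollFirst() and pollLast() mutate the wrapped set. Below is a minimal sketch of the kind of regression test that could guard against this reappearing; it uses the JDK's Collections.unmodifiableNavigableSet as a stand-in for the Apache UnmodifiableNavigableSet decorator, and it is not the actual Apache Commons test code.

```java
import java.util.Collections;
import java.util.List;
import java.util.NavigableSet;
import java.util.TreeSet;

// The contract being checked is the one COLLECTIONS-799 violated:
// pollFirst()/pollLast() on an unmodifiable view must throw instead of
// removing elements from the backing set.
public class UnmodifiablePollRegressionSketch {
    public static void main(String[] args) {
        NavigableSet<Integer> backing = new TreeSet<>(List.of(1, 2, 3));
        NavigableSet<Integer> view = Collections.unmodifiableNavigableSet(backing);

        try {
            view.pollFirst(); // a buggy decorator would silently remove 1 from 'backing'
            System.out.println("BUG: pollFirst() did not throw; backing set is now " + backing);
        } catch (UnsupportedOperationException expected) {
            System.out.println("OK: pollFirst() rejected; backing set unchanged: " + backing);
        }

        try {
            view.pollLast(); // same check for the other poll method
            System.out.println("BUG: pollLast() did not throw; backing set is now " + backing);
        } catch (UnsupportedOperationException expected) {
            System.out.println("OK: pollLast() rejected; backing set unchanged: " + backing);
        }
    }
}
```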
-
Chaos Engineering:
The experiments they describe in the paper are:
- Chaos Monkey (randomly selects virtual machines that host their production services and terminates them)
- Chaos Kong (simulates the failure of an entire Amazon EC2 region)
- Failure Injection Testing, or FIT (causes requests between services to fail and verifies that the system degrades gracefully)
- Injecting latency into requests between services
- Failing an internal service
- Automating the experiments so they run continuously
Requirements of these experiments (a minimal sketch of this control vs. experimental comparison appears after the results list below):
- Define 'steady state' as some measurable output of the system
- Hypothesize that this steady state will continue in both the control group and the experimental group
- Introduce variables that reflect real-world events
- Try to disprove the hypothesis by looking for a difference in steady state between the control group and the experimental group
Variables they observe during the experiments:
- SPS (the rate at which users start streaming, "starts per second"), the metric they use to characterize the steady-state behavior of the system
- Finer-grained metrics such as an increase in request latency or CPU utilization
Main results they obtained:
- Running the experiments revealed weak links that could be fixed before they caused failures or long loading times for customers.
- They cannot fully reproduce all aspects of the system within a test context, which is why they experiment in production.
- They moved from running a few manual experiments to automated, large-scale experimentation.
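To make the control/experimental comparison from the requirements list concrete, here is a minimal sketch of my own (not Netflix's tooling; all class names, failure rates, and thresholds are assumed values for illustration). A fault is injected into the experimental group only, and a steady-state metric, here request success rate, is compared between the two groups:

```java
import java.util.Random;

public class ChaosExperimentSketch {
    private static final Random RANDOM = new Random(42);

    // Simulate one request; 'faultInjected' models e.g. a terminated instance or
    // injected latency that pushes the request past its timeout.
    static boolean requestSucceeds(boolean faultInjected) {
        double failureRate = faultInjected ? 0.20 : 0.01; // assumed impact vs. baseline
        return RANDOM.nextDouble() > failureRate;
    }

    // Measure the steady-state metric (success rate) over a batch of requests.
    static double successRate(int requests, boolean faultInjected) {
        int ok = 0;
        for (int i = 0; i < requests; i++) {
            if (requestSucceeds(faultInjected)) ok++;
        }
        return (double) ok / requests;
    }

    public static void main(String[] args) {
        double control = successRate(10_000, false);      // steady state, no fault
        double experimental = successRate(10_000, true);  // fault injected in this group only
        System.out.printf("control=%.3f experimental=%.3f%n", control, experimental);
        // The hypothesis "steady state continues in both groups" is disproved when the
        // experimental group's success rate drops noticeably below the control group's.
        System.out.println("hypothesis holds: " + (control - experimental < 0.05));
    }
}
```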
Is Netflix the only company performing these experiments? No, many other companies use Chaos Engineering too, such as Microsoft, Google, Amazon, Facebook, etc.
How could these experiments be carried out in other organizations, in terms of the kind of experiment performed and the system variables observed? It depends on the system, for example:
- Web/API services: kill service instances, add latency
- E-commerce: observe the number of completed purchases per second; an ad-serving service could use the number of ads viewed by users per second
- Banking: simulate a peak-traffic scenario, disconnect an external API
Variables to observe:
- User latency
- Success rate
- Data consistency metric
- Error rate
-
WebAssembly: