ISSTA/ECOOP 2023 Observations and Gained Insights

Mon 17 Jul

Session 1 FUZZING at Amazon Auditorium (Gates G20)

Welcome and Introductions

The following reviewing criteria for workshop papers can serve as a guide for us in writing papers:

  • Is the problem that is addressed significant for research or practice?
  • Do the contributions (technique, hypothesis, or evaluation) advance sufficiently over existing work?
  • Is the methodology (experimental setup or protocol) used to validate the claims or hypotheses reasonable?
  • Can an independent research group reproduce the results, given the proposed methodology (experimental setup)?

Establish significance, novelty, and soundness even if the results do not show a large performance gain. Investigate unexpected results, for example, explain why results are negative.

Three Colours of Fuzzing: Reflections and Open Challenges - Cristian Cadar

Why does fuzzing keep finding bugs in production software? LOTS of code is added or modified without being tested. (Covrig: A framework for the analysis of code, test, and coverage evolution in real software)

Fuzzing is not automated enough. Fuzz targets (test drivers) need to be manually specified. There is much work on improving fuzzing heuristics, but more work is required for test driver generation.
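For illustration, a fuzz target is a small driver that feeds fuzzer-generated data into the code under test; today, such drivers are mostly written by hand. A minimal sketch in the style of Jazzer (Java's libFuzzer binding); `StringParser` is a hypothetical class under test:

```java
// Minimal fuzz-target sketch in the style of Jazzer.
// StringParser.parse is a hypothetical method under test.
import com.code_intelligence.jazzer.api.FuzzedDataProvider;

public class StringParserFuzzTarget {
    public static void fuzzerTestOneInput(FuzzedDataProvider data) {
        String input = data.consumeRemainingAsString();
        try {
            StringParser.parse(input); // hypothetical code under test
        } catch (IllegalArgumentException expected) {
            // Rejecting malformed input is fine; any other exception,
            // crash, or assertion failure is reported as a finding.
        }
    }
}
```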

An ideal test case should serve quality assurance, act as a debugging aid, and provide documentation. It should target human users and be small, fast, readable, and well-documented. Automatically generated test suites, such as those produced by fuzzers, still need to improve in these respects. They achieve high code coverage and excel at finding generic crash-type bugs (assertion failures, crashes, undefined behavior) in general software, which may not be very realistic, but they achieve low feature coverage and are poor at detecting logical bugs in domain-specific software.
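The distinction matters in practice. A hedged Java sketch (both methods are hypothetical): a fuzzer readily triggers the generic crash in the first method, but the logical bug in the second produces no crash and needs a domain-specific oracle to detect.

```java
// Hypothetical examples contrasting the two bug classes.
public class BugClasses {
    // Generic crash bug: an empty input triggers an unguarded exception,
    // which a fuzzer detects with no domain knowledge at all.
    static char firstChar(String s) {
        return s.charAt(0); // StringIndexOutOfBoundsException on ""
    }

    // Logical bug: silently wrong result (suppose the domain requires
    // rounding half up); there is no crash, so a fuzzer needs a
    // domain-specific oracle to notice.
    static long roundedAverage(long a, long b) {
        return (a + b) / 2; // also overflows for large a + b
    }
}
```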

On the other hand, these properties make fuzzing appropriate for use cases outside security and software testing that require a novel search for diverse failing inputs, corner cases, and loopholes, such as ML models and even investigating legal documents (Rohan).

Developers tend to be afraid of using fuzzers because they do not understand them or regard them as security tools rather than standard testing tools. Allowing fuzzing to operate at a higher, declarative level and combining it with domain-specific specification languages would be beneficial.
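Property-based testing frameworks hint at what such a higher, declarative level could look like. A minimal sketch using jqwik, where the developer states only a domain-level property and the engine generates inputs (the `Codec` round-trip property is a hypothetical example, not from the talk):

```java
// Declarative property sketch using jqwik: the engine supplies the
// inputs while the developer only states the domain-level property.
import net.jqwik.api.*;

class EncodingProperties {
    @Property
    boolean decodeInvertsEncode(@ForAll String s) {
        // Codec.encode/decode are hypothetical domain functions.
        return Codec.decode(Codec.encode(s)).equals(s);
    }
}
```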

Sound fuzzer evaluation is challenging.

  • Well-designed experiment methodology.
  • Huge variance due to randomness, which demands substantial computational resources (e.g., 20 repetitions × 24 hours × X fuzzers × Y programs); a sketch of a standard effect-size comparison follows this list.
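Guidance on fuzzer evaluation (e.g., Klees et al., "Evaluating Fuzz Testing") recommends reporting effect sizes over repeated runs rather than single comparisons. A minimal sketch (an illustration assumed here, not from the talk) computing the Vargha-Delaney Â12 statistic between two fuzzers' per-run coverage results:

```java
// Vargha-Delaney A12 effect size: the probability that a run of
// fuzzer A beats a run of fuzzer B (0.5 means no difference).
public class A12 {
    static double a12(double[] a, double[] b) {
        double wins = 0;
        for (double x : a) {
            for (double y : b) {
                if (x > y) wins += 1.0;
                else if (x == y) wins += 0.5;
            }
        }
        return wins / (a.length * (double) b.length);
    }

    public static void main(String[] args) {
        double[] fuzzerA = {1200, 1310, 1280, 1295}; // hypothetical branches covered per run
        double[] fuzzerB = {1190, 1205, 1300, 1210};
        System.out.println(a12(fuzzerA, fuzzerB));   // > 0.5 favors fuzzer A
    }
}
```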

Thu 20 Jul

Keynotes at Amazon Auditorium (Gates G20)

Paper Reviewing Statistics

  • 44/97 papers accepted
  • Round 1: 40 submitted, 17 accepted, 9 rejected, 14 resubmit
  • Round 2: 57 submitted (11 resubmissions), 27 accepted, 18 rejected, 12 resubmit

Dahl-Nygaard Senior Prize: Safe Journeys into the Unknown - Object Capabilities - Sophia Drossopoulou

ISSTA 10: Test Optimizations - ISSTA Technical Papers at Smith Classroom (Gates G10)

June: A Type Testability Transformation for Improved ATG Performance

Automatically generating unit tests is a powerful approach to exercising complex software. However, existing methods frequently fail to produce appropriate input values, such as strings that can pass domain-specific sanity checks. For instance, Randoop commonly uses "hi!" as a string value. (Saying 'Hi!' is not enough: Mining inputs for effective test generation)
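For instance, a generated value like "hi!" essentially never survives a domain-specific sanity check such as the following (hypothetical) one, so the code behind the check goes unexercised:

```java
// Hypothetical domain-specific sanity check that random strings such
// as Randoop's "hi!" essentially never pass, leaving the body uncovered.
public class OrderService {
    static void processOrder(String orderId) {
        // Expect IDs like "ORD-12345".
        if (!orderId.matches("ORD-\\d{5}")) {
            throw new IllegalArgumentException("malformed order id");
        }
        // ... domain logic that automated test generation rarely reaches
    }
}
```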

Pattern-Based Peephole Optimizations with Java JIT Tests

To demonstrate JOG's advantage over hand-written peephole optimizations in terms of ease of writing, the authors compare it against existing hand-written peephole optimizations, using the number of characters and the number of lines as metrics.
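For context, a peephole optimization rewrites a small, local instruction pattern into a cheaper equivalent. A toy sketch over a hypothetical IR (this only illustrates the idea, not JOG's actual pattern syntax):

```java
// Toy peephole rewrite over a hypothetical IR: x * 2^k  ==>  x << k.
// JOG expresses such before/after patterns declaratively in Java; this
// sketch is only an illustration of the technique.
public class Peephole {
    sealed interface Node permits Var, Constant, Mul, Shl {}
    record Var(String name) implements Node {}
    record Constant(long value) implements Node {}
    record Mul(Node lhs, Node rhs) implements Node {}
    record Shl(Node lhs, Node rhs) implements Node {}

    static Node rewrite(Node n) {
        // Strength reduction: multiply by a power of two becomes a shift.
        if (n instanceof Mul(Node lhs, Constant c) && Long.bitCount(c.value()) == 1) {
            return new Shl(lhs, new Constant(Long.numberOfTrailingZeros(c.value())));
        }
        return n;
    }

    public static void main(String[] args) {
        System.out.println(rewrite(new Mul(new Var("x"), new Constant(8))));
        // -> Shl[lhs=Var[name=x], rhs=Constant[value=3]]
    }
}
```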

GPUHarbor: Testing GPU Memory Consistency at Large (Experience Paper)

The tool has been implemented as a Web app using WebGPU to access the GPU, allowing the audience to try it out during the talk.
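For background, memory consistency testing runs small "litmus tests" many times and counts outcomes that only a weak memory model allows. A CPU-side Java analogue of the classic message-passing litmus test (GPUHarbor runs such tests on GPUs via WebGPU; this sketch is only an illustration of the idea):

```java
// Message-passing litmus test: with plain (non-volatile) fields, the
// reader may observe flag == 1 while data == 0 under weak memory
// behavior; run many iterations and count the outcomes.
public class MessagePassingLitmus {
    static int data, flag;

    public static void main(String[] args) throws InterruptedException {
        int weakOutcomes = 0;
        for (int i = 0; i < 100_000; i++) {
            data = 0; flag = 0;
            final int[] observed = new int[2];
            Thread writer = new Thread(() -> { data = 1; flag = 1; });
            Thread reader = new Thread(() -> {
                observed[0] = flag; observed[1] = data;
            });
            writer.start(); reader.start();
            writer.join(); reader.join();
            if (observed[0] == 1 && observed[1] == 0) weakOutcomes++;
        }
        System.out.println("weak outcomes observed: " + weakOutcomes);
    }
}
```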

Keynote - ISSTA'24 Preview - Closing - Keynotes at Amazon Auditorium (Gates G20)

Machine Learning for Software Engineering

What underlies the success of machine learning for software engineering?

  • The naturalness of code. `i++` is predictable given `for (i = 0; i < 10;`, and `backward()` is predictable given `loss.` (a toy model sketch follows this list).
  • The bimodality of code: code is intertwined with natural language. Q. How do I get a platform-dependent newline character? A. `public static String getPlatformLineSeparator() { return System.getProperty("line.separator"); }`
  • Code has predictable properties. Given `... = x.weight * x.height`, the hole in `... = y.weight * ???` is predictably `y.height`.
  • Large amount of data (GitHub repos with code, version history, and commit logs, StackOverflow questions and answers, internal corpora in companies, etc.)
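The naturalness observation (Hindle et al.) is that code is highly repetitive, so even simple statistical language models predict the next token well. A toy bigram-model sketch in Java (illustrative only, not from the keynote):

```java
// Toy bigram language model over code tokens: predict the token that
// most frequently followed the previous token in a small corpus.
import java.util.*;

public class BigramCodeModel {
    final Map<String, Map<String, Integer>> counts = new HashMap<>();

    void train(List<String> tokens) {
        for (int i = 0; i + 1 < tokens.size(); i++) {
            counts.computeIfAbsent(tokens.get(i), k -> new HashMap<>())
                  .merge(tokens.get(i + 1), 1, Integer::sum);
        }
    }

    String predict(String prev) {
        return counts.getOrDefault(prev, Map.of())
                     .entrySet().stream()
                     .max(Map.Entry.comparingByValue())
                     .map(Map.Entry::getKey).orElse("<unk>");
    }

    public static void main(String[] args) {
        BigramCodeModel m = new BigramCodeModel();
        m.train(List.of("for", "(", "i", "=", "0", ";", "i", "<", "10", ";", "i", "++", ")"));
        System.out.println(m.predict("i")); // "=", "<", or "++" depending on counts
    }
}
```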
