Publications

Do Developers Fix Continuous Integration Smells?
Ayberk Yaşa, Ege Ergül, Hakan Erdogmus, Eray Tuzun
PROMISE 2023: Proceedings of the 19th International Conference on Predictive Models and Data Analytics in Software Engineering

Context: Continuous Integration (CI) is a common software engineering practice in which the code changes are frequently merged into a software project repository after automated builds and tests have been successfully run. CI enables developers to quickly detect bugs, enhance the quality of the code, and shorten review times. However, developers may encounter some obstacles in following the CI principles. They may be unaware of them, they may follow the principles partially or they may even act against them. These behaviors result in CI smells. CI smells may in turn lessen the benefits of CI. Addressing CI smells rapidly allows software projects to fully reap the benefits of CI and increase its effectiveness. Objective: The main objective of this study is to investigate how frequently developers address CI smells. Methods: To achieve this objective, we first selected seven smells, then implemented scripts for detecting these smells automatically, and then ran the scripts in eight open-source software projects using GitHub Actions. To assess the resolution extent of CI smells by practitioners, we calculated the occurrences and time-to-resolution (TTR) of each smell. Results: Our results suggest that Skipped Job smell has been fixed slightly more than other CI smells. The most frequently observed smell was Long Build, which was detected in an average of 19.03% of all CI builds. Fake Success smell does not get resolved in projects where it exists. Conclusion: Our study reveals that practitioners do not fix CI smells in practice. Further studies are needed to explore the underlying reasons behind this, in order to recommend more effective strategies for addressing these smells.

An Empirical Analysis of Issue Templates Usage in Large-Scale Projects on GitHub
Emre Sülün, Metehan Saçakçı, Eray Tuzun
ACM Transactions on Software Engineering and Methodology

Context: GitHub Issues is a widely used issue tracking tool in open-source software projects. Originally designed with broad flexibility, its lack of standardization led to incomplete issue reports, impeding software development and maintenance efficiency. To counteract this, GitHub introduced issue templates in 2016, which rapidly became popular. Objective: Our study assesses the current use and evolution of these templates in large-scale open-source projects and their impact on issue tracking metrics, including resolution time, number of reopens, and number of issue comments. Methods: Employing a comprehensive analysis of 350 templates from 100 projects, we also evaluated over 1.9 million issues for template conformity and impact. Additionally, we solicited insights from open-source software maintainers through a survey. Results: Our findings highlight issue templates’ extensive usage in 99 of the 100 surveyed projects, with a growing preference for YAML-based templates, a more structured template variant. Projects with a template exhibited markedly reduced resolution time (381.02 days to 103.18 days) and reduced issue comment count (4.95 to 4.32) compared to those without. The use of YAML-based templates further significantly decreased resolution time, the number of reopenings, and the discussion extent. Conclusion: Thus, our research underscores issue templates’ positive impact on large-scale open-source projects, offering recommendations for improved effectiveness.

BFSig: Leveraging File Significance in Bus Factor Estimation
Vahid Haratian, Mikhail Evtikhiev, Pouria Derakhshanfar, Eray Tuzun, Vladimir Kovalenko
ESEC/FSE 2023: Proceedings of the 31st ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering

Context: Software projects experience the departure of developers due to various reasons. As developers are one of the main sources of knowledge in software projects, their absence will inevitably result in a certain degree of knowledge depletion. Bus Factor (BF) is a metric to evaluate how this knowledge loss can affect the project’s continuity. Conventionally, BF is calculated as the size of the smallest set of developers whose departure removes over half of the project knowledge. Objective: Current state-of-the-art approaches measure developers’ knowledge by the number of authored files, utilizing version control system (VCS) information. However, numerous studies have shown that files in software projects have different significance. In this study, we explore how weighting files according to their significance affects the performance of two prevailing BF estimators. Methods: We derive significance scores by computing five well-known graph metrics from the project’s dependency graph: PageRank, In-/Out-/All-Degree, and Betweenness Centralities. Furthermore, we introduce BFSig, a prototype of our approach. Finally, we present a new dataset comprising reported BF scores collected by surveying software practitioners from five prominent GitHub repositories. Results: Our results indicate that BFSig outperforms the baselines by up to an 18% reduction in terms of Normalized Mean Absolute Error (NMAE). Moreover, BFSig yields 18% fewer False Negatives in identifying potential risks associated with low BF. Besides, our respondent confirmed BFSig’s versatility by showing its ability to assess the BF of the project’s subfolders. Conclusion: In conclusion, we believe that, to estimate BF from authorship, software components of higher importance should be assigned heavier weights. Currently, BFSig exclusively explores the topological characteristics of these components. Nevertheless, considering attributes such as code complexity and bug proneness could potentially enhance the performance of BFSig.
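
As a rough illustration of the idea in this abstract, the sketch below estimates a bus factor greedily and lets files carry different weights. It is not the BFSig implementation: the function name, the greedy strategy, the rule that a file is "lost" once all of its authors have left, and the example weights are all assumptions made for illustration.

```python
def weighted_bus_factor(file_authors, file_weight=None):
    """Greedy bus factor estimate (illustrative, not BFSig itself).

    file_authors: dict mapping file name -> set of developers who know it
                  (each file is assumed to have at least one author).
    file_weight:  optional dict mapping file name -> significance weight;
                  defaults to 1.0 per file (the unweighted case).
    Returns the number of developers removed before more than half of the
    total file weight has no remaining author.
    """
    if file_weight is None:
        file_weight = {f: 1.0 for f in file_authors}
    total = sum(file_weight[f] for f in file_authors)
    remaining = {f: set(authors) for f, authors in file_authors.items()}
    removed = 0
    lost = 0.0
    while lost <= total / 2:
        # Weighted amount of knowledge each remaining developer covers.
        coverage = {}
        for f, authors in remaining.items():
            for dev in authors:
                coverage[dev] = coverage.get(dev, 0.0) + file_weight[f]
        if not coverage:
            break  # nobody left to remove
        # Remove the developer covering the most weight (ties: by name).
        top = max(sorted(coverage), key=coverage.get)
        removed += 1
        for authors in remaining.values():
            authors.discard(top)
        lost = sum(file_weight[f] for f, a in remaining.items() if not a)
    return removed


# Hypothetical project: four files, three developers.
authors = {
    "core.py": {"ann"},
    "util.py": {"bob"},
    "docs.py": {"bob", "cy"},
    "cli.py": {"cy"},
}
print(weighted_bus_factor(authors))  # 2 when every file counts equally
# Treating core.py as far more significant makes "ann" a single point of failure.
weights = {"core.py": 10.0, "util.py": 1.0, "docs.py": 1.0, "cli.py": 1.0}
print(weighted_bus_factor(authors, weights))  # 1
```

In this toy case, weighting changes the estimate from 2 to 1, which is the kind of effect the abstract attributes to file significance. In the paper the weights come from graph metrics such as PageRank; here they are hand-picked.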

Analyzing Bug Life Cycles to Derive Practical Insights
Çağrı Eren, Kerem Şahin, Eray Tuzun
27th International Conference on Evaluation and Assessment in Software Engineering (EASE 2023)

Context: Bug tracking systems define bug life cycles that outline their bug tracking process. Objective: In this study, we assess bug life cycles to identify bottlenecks in the bug tracking processes, and examine the effectiveness of bug tracking system usage practices linked to bug states and state transitions. Methods: To achieve this, we examined the bug life cycles of three open-source software projects which use Bugzilla as their bug tracking system. In total, we analyzed 106,196 bugs gathered from these projects. We started by looking at the temporal and quantitative aspects of these projects’ bug life cycles. After that, we collected data about how bug life cycles differ over time. Finally, we inspected the frequency of reopened and state-looping bugs in these projects. Results: After our analysis, we deduced that the presented temporal and quantitative analysis of bug life cycles is useful for finding bottlenecks and undesired behaviors in the bug tracking processes. We also inferred that examining the changes in bug life cycles over time can provide insights into how bug tracking practices changed throughout the project’s lifetime, and it can be used as a parameter to assess whether the bug tracking system usage has improved. Lastly, we deduced that analyzing the frequency of undesired state trails provides insights into the performance of bug tracking processes. Conclusion: Based on the insights gained from analyzing bug life cycles with the presented methods, we believe that decision makers can improve their workflow by adding or removing states in the bug life cycle and adding new rules and restrictions to their bug tracking process.

Towards Better Code Reviews: Using Mutation Testing to Improve Reviewer Attention
Ziya Mukhtarov, Mannan Abdul, Mokhlaroyim Raupova, Javid Baghirov, Osama Tanveer, Haluk Altunel, Eray Tüzün
The International Conference on Software and System Processes (ICSSP 2023)

Context: Code reviews, while effective, can be crippled by process smells if not performed correctly. A typical process smell that harms the efficacy of code reviews is the ‘Looks Good To Me’ (LGTM) smell, wherein a reviewer approves a code review task without reviewing the code attentively. Low-quality code reviews can be harmful, as they can cause bugs to slip into a product codebase leading to potentially severe consequences. Objective: In this paper, we propose an innovative solution to potentially minimize the occurrence of the LGTM smell commonly found in code reviews. Methods: We built a tool that is a proof-of-concept implementation of our solution, which incorporates the concept of mutation testing into code reviews. It provides a platform where pull request authors can apply mutations to the pull request code in GitHub. Results: Reviewer attention and review efficacy are measured based on their mutation score. We validated our proposed solution with eight developers and received promising results. Conclusion: To the best of our knowledge, our proof of concept implementation is the first-ever code review tool that uses the concept of mutation testing.
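
To make the notion of a reviewer's mutation score concrete, here is a minimal sketch. It assumes the score is the fraction of injected mutants that the reviewer flagged. The abstract does not give the exact formula used by the tool, so the function name and the definition are assumptions.

```python
def mutation_score(injected, flagged):
    """Fraction of injected mutants that the reviewer flagged.

    injected: identifiers of mutants the author injected into the pull request.
    flagged:  identifiers of code locations the reviewer reported as suspicious.
    Flags that do not correspond to an injected mutant are ignored here.
    """
    injected = set(injected)
    if not injected:
        raise ValueError("at least one injected mutant is required")
    return len(injected & set(flagged)) / len(injected)


# Hypothetical review: four mutants injected, two of them caught.
print(mutation_score({"m1", "m2", "m3", "m4"}, {"m1", "m3", "unrelated"}))  # 0.5
```

A low score under this definition would suggest that the reviewer approved the change without reading it closely, which is the LGTM smell the abstract targets.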

Assessing the quality of GitHub Copilot’s code generation
Burak Yetistiren, Isik Ozsoy, Eray Tuzun
18th International Conference on Predictive Models and Data Analytics in Software Engineering

Context: The introduction of GitHub’s new code generation tool, GitHub Copilot, seems to be the first well-established instance of an AI pair-programmer. GitHub Copilot has access to a large number of open-source projects, enabling it to utilize more extensive code in various programming languages than other code generation tools. Although the initial and informal assessments are promising, a systematic evaluation is needed to explore the limits and benefits of GitHub Copilot. Objective: The main objective of this study is to assess the quality of generated code provided by GitHub Copilot. We also aim to evaluate the impact of the quality and variety of input parameters fed to GitHub Copilot. Methods: To achieve this aim, we created an experimental setup for evaluating the generated code in terms of validity, correctness, and efficiency. Results: The results suggest that GitHub Copilot was able to generate valid code with a 91.5% success rate. In terms of code correctness, out of 164 problems, 47 (28.7%) were generated correctly, 84 (51.2%) partially correctly, and 33 (20.1%) incorrectly. Conclusion: Our empirical analysis shows that GitHub Copilot is a promising tool based on the results we obtained; however, further and more comprehensive assessment is needed in the future.

Cleaning ground truth data in software task assignment
K. Ayberk Tecimer, Eray Tüzün, Cansu Moran, Hakan Erdogmus
Information and Software Technology Journal 2022

Context: In the context of collaborative software development, there are many application areas of task assignment such as assigning a developer to fix a bug, or assigning a code reviewer to a pull request. Most task assignment techniques in the literature build and evaluate their models based on datasets collected from real projects. The techniques invariably presume that these datasets reliably represent the “ground truth”. In a project dataset used to build an automated task assignment system, the recommended assignee for the task is usually assumed to be the best assignee for that task. However, in practice, the task assignee may not be the best possible task assignee, or even a sufficiently qualified one. Objective: We aim to clean up the ground truth by removing the samples that are potentially problematic or suspect with the assumption that removing such samples would reduce any systematic labeling bias in the dataset and lead to performance improvements. Methods: We devised a debiasing method to detect potentially problematic samples in task assignment datasets. We then evaluated the method’s impact on the performance of seven task assignment techniques by comparing the Mean Reciprocal Rank (MRR) scores before and after debiasing. We used two different task assignment applications for this purpose: Code Reviewer Recommendation (CRR) and Bug Assignment (BA). Results: In the CRR application, we achieved an average MRR improvement of 18.17% for the three learning-based techniques tested on two datasets. No significant improvements were observed for the two optimization-based techniques tested on the same datasets. In the BA application, we achieved a similar average MRR improvement of 18.40% for the two learning-based techniques tested on four different datasets. Conclusion: Debiasing the ground truth data by removing suspect samples can help improve the performance of learning-based techniques in software task assignment applications.
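
The Mean Reciprocal Rank (MRR) metric used in this evaluation can be sketched as follows. This is the standard textbook definition, not the authors' evaluation code. The function name and the example data are illustrative.

```python
def mean_reciprocal_rank(rankings, truths):
    """Average of 1/rank of the true assignee in each recommendation list.

    rankings: list of ranked candidate lists (best candidate first).
    truths:   list with the actual assignee for each task.
    A task whose true assignee is missing from the ranking contributes 0.
    """
    if not rankings:
        return 0.0
    total = 0.0
    for ranking, truth in zip(rankings, truths):
        for position, candidate in enumerate(ranking, start=1):
            if candidate == truth:
                total += 1.0 / position
                break
    return total / len(rankings)


# Hypothetical recommendations for three review tasks.
rankings = [["a", "b", "c"], ["b", "a"], ["c"]]
truths = ["a", "a", "x"]
print(mean_reciprocal_rank(rankings, truths))  # (1 + 0.5 + 0) / 3 = 0.5
```

Because the metric depends directly on the recorded assignee, a mislabeled "ground truth" sample penalizes a technique even when its top recommendation is a sensible one, which is the motivation for removing suspect samples.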

Taxonomy of bug tracking process smells: Perceptions of practitioners and an empirical analysis
Khushbakht Ali Qamar, Emre Sülün, Eray Tüzün
Information and Software Technology Journal 2022

Context: While there is no consensus on a formally specified bug tracking process, certain rules and best practices for an optimal bug tracking process are accepted by many companies and open-source software (OSS) projects. Despite slight variations between different platforms, the primary aim of all these rules and practices is to perform a more efficient bug tracking process. Practitioners’ non-compliance with the best practices not only impedes the benefits of the bug tracking process but also negatively affects the other phases of the software development life cycle. Objective: The goal of this study is to gain a better knowledge of the bad practices that occur during the bug tracking process (bug tracking process smells) and to perform quantitative analysis to show that these process smells exist in bug tracking systems. Moreover, we want to know the perception of software practitioners related to these process smells and also observe the impact of process smells on the bug tracking process. Methods: Based on the results of a multivocal literature review, we analyzed 60 sources in academic and gray literature and propose a taxonomy of 12 bad practices in the bug tracking process. To quantitatively analyze these process smells, we inspected bug reports collected from eight projects which use Jira, Bugzilla, and GitHub Issues. To get an idea about the perception of practitioners about the taxonomy of bug tracking process smells, we conducted a targeted survey with 30 software practitioners. Moreover, we statistically analyzed the impact of bug tracking process smells on the resolution time and reopening count of bugs. Results: We observed from our empirical results that a considerable amount of bug tracking process smells exist in all projects and some of the process smell categories have statistically significant impacts on quality and speed. Survey results show that the majority of software practitioners agree with the proposed taxonomy of BT process smells. Conclusion: The statistical analysis reveals that bug tracking process smells have an impact on OSS projects. The proposed taxonomy may serve as a foundation for best practices and tool support for detecting and avoiding bug tracking process smells.

Investigating the Impact of Forgetting in Software Development
Utku Ünal, Eray Tüzün, Tamer Gezici, Ausaf Ahmed Farooqui
19th International Conference on Mining Software Repositories (MSR 2022)

Context: Forgetting is defined as a gradual process of losing information. Even though there are many studies demonstrating the effect of forgetting in software development, to the best of our knowledge, no study explores the impact of forgetting in software development using a controlled experiment approach. Objective: We would like to provide insights on the impact of forgetting in software development projects. We want to examine whether the recency & frequency of interaction impact forgetting in software development. Methods: We will conduct an experiment that examines the impact of forgetting in software development. Participants will first do an initial task. According to their initial task performance, they will be assigned to either the experiment or the control group. The experiment group will then do two additional tasks to enhance their exposure to the code. Both groups will then do a final task to see if additional exposure to the code benefits the experiment group's performance in the final task. Finally, we will conduct a survey and a recall task with the same participants to collect data about their perceptions of forgetting and quantify their memory performance, respectively.

Analyzing developer contributions using artifact traceability graphs
H. Alperen Cetin, Eray Tuzun
Empirical Software Engineering

Context: In a software project, properly analyzing the contributions of developers could provide valuable insights for decision-makers. The contributions of a developer could be in many different forms such as committing and reviewing code, opening and resolving issues. Previous approaches mainly consider the commit-based contributions which provide an incomplete picture of developer contributions. Objective: Different from the traditional commit-based approaches for analyzing developer contributions, we aim to provide a more holistic approach to reflect the rich set of software development activities using artifact traceability graphs. Method: For analyzing the developer contributions, we propose a novel categorization of developers (Jacks, Mavens and Connectors) in a software project. We introduce a set of algorithms on artifact traceability graphs to identify key developers, recommend replacements for leaving developers and evaluate knowledge distribution among developers. Results: We evaluate our proposed algorithms on six open-source projects and demonstrate that the identified key developers match the top commenters up to 98%, recommended replacements are correct up to 91% and identified knowledge distribution labels are compatible 94% on average with the baseline approaches. Conclusions: The proposed algorithms using artifact traceability graphs for analyzing developer contributions could be used by software project decision-makers in several scenarios. (1) Identifying different types of key developers. (2) Finding a replacement developer in large teams. (3) Evaluating the overall knowledge distribution amongst developers to take early precautions.

Characterizing duplicate bugs: Perceptions of practitioners and an empirical analysis
Berfin Kucuk, Idil Hanhan, Eray Tuzun
Journal of Software: Evolution and Process

Bug handling is an essential part of the software development process. Ideally, in a bug-tracking system, bugs are reported, fixed, verified, and closed. In some cases, bugs have to be reopened mostly due to an incorrect fix. However, instead of reopening the existing bug report, users may submit a new report for a previously reported bug, which results in duplicate bug reports. Users might report duplicate bugs if they miss the previous bug report or if the previous bug is unresolved (i.e., in progress) and therefore cannot be reopened. These duplicate bug reports may cost extra maintenance efforts in triaging and bug fixing. There have been several studies on characterizing reopened bugs and duplicate bug reports; however, to the best of our knowledge, there has been no prior work on understanding the dynamics of their interaction, which is missed reopen bugs. Our study is based on analyzing the difference between duplicate and nonduplicate bugs and further categorizing the duplicate bugs. In this regard, we categorize duplicate bugs according to the original bug's resolution status at the duplicate's creation time as Master-Unresolved bugs and Master-Resolved (Missed Reopen) bugs to distinguish their properties. We compare these two types of bugs in terms of their relationship to their master bug, bug surface time, bug fix time, bug's severity, and the number of users involved. We perform case studies using the Eclipse and Mozilla projects' bug repositories that include more than 165,500 and 394,000 bug reports, respectively. Additionally, we investigate the perceived importance, impact, and causes of duplicate bugs, as well as the difference between nonduplicate and duplicate bugs and its categories for practitioners in the software industry by conducting a survey.

Bus Factor In Practice
Elgun Jabrayilzade, Mikhail Evtikhiev, Eray Tüzün, Vladimir Kovalenko
44th International Conference on Software Engineering (ICSE 2022)

Bus factor is a metric that identifies how resilient a project is to sudden engineer turnover. It states the minimal number of engineers that have to be hit by a bus for a project to be stalled. Even though the metric is often discussed in the community, few studies consider its general relevance. Moreover, the existing tools for bus factor estimation focus solely on the data from version control systems, even though there exist other channels for knowledge generation and distribution. With a survey of 269 engineers, we find that the bus factor is perceived as an important problem in collective development, and determine the highest impact channels of knowledge generation and distribution in software development teams. We also propose a multimodal bus factor estimation algorithm that uses data on code reviews and meetings together with the VCS data. We test the algorithm on 13 projects developed at JetBrains and compare its results to those of the state-of-the-art tool by Avelino et al. against the ground truth collected in a survey of the engineers working on these projects. Our algorithm is slightly better than the tool by Avelino et al. at predicting both the bus factor and the key developers. Finally, we use the interviews and the surveys to derive a set of best practices to address the bus factor issue and proposals for a possible bus factor assessment tool.

Towards a Taxonomy of Inline Code Comment Smells
Elgun Jabrayilzade, Olcaytu Gürkan, Eray Tüzün
IEEE 21st International Working Conference on Source Code Analysis and Manipulation (SCAM 2021)

Code comments play a vital role in source code comprehension and software maintainability. It is common for developers to write comments for explaining a code snippet. However, low-quality comments can have a detrimental effect on software quality or be ineffective for code understanding. This study aims to create a taxonomy of inline code comment smells and determine how commonly each smell type occurs in software projects. We conducted a multivocal literature review for defining the initial taxonomy of inline comment smells. Afterward, we manually labeled 899 inline comments from three open-source Java projects. We created a taxonomy of 11 inline code comment smell types and found out that the smells exist in practice with varying degrees.

Bug Tracking Process Smells In Practice
Erdem Tuna, Vladimir Kovalenko, Eray Tüzün
44th International Conference on Software Engineering (ICSE 2022)

Software teams use bug tracking (BT) tools to report and manage bugs. Each record in a bug tracking system (BTS) is a reporting entity consisting of several information fields. The contents of the reports are similar across different tracking tools, though not the same. The variation in the workflow of teams prevents defining an ideal process of running BTS. Nevertheless, there are best practices reported both in white and gray literature. Developer teams may not adopt the best practices in their BT process. This study investigates the non-compliance of developers with best practices, so-called smells, in the BT process. We mine bug reports of four projects in the BTS of JetBrains, a software company, to observe the prevalence of BT smells in an industrial setting. Also, we survey developers to see (1) if they recognize the smells, (2) their perception of the severity of the smells, and (3) the potential benefits of a BT process smell detection tool. We found that (1) smells occur, and their detection requires a solid understanding of the BT practices of the projects, (2) smell severity perception varies across smell types, and (3) developers considered that a smell detection tool would be useful for six of the smell categories.

Teaching Design Patterns Using Interactive Methods
Ata Yurtsever, Eray Tüzün
CSEE&T / HICSS 2022 ▪ Conference on Software Engineering Education and Training: Special Track of Hawaii International Conference on System Sciences

Even though design patterns are one of the most important building blocks in the current software engineering ecosystem, computer science and software engineering graduates face trouble applying these patterns. To address this, we propose a tutorial and an online lab assessment method to solidify the idea of design patterns for students. The tutorial part integrates a live coding session. The online lab assessment consists of a three-stage process (designing a solution using a class diagram, peer review, and implementation) where students are expected to come up with a fully working solution using design patterns. The proposed approach is applied twice over two semesters to a total sum of 196 students. We discuss the effects of these interactive educational methods on learning by comparing pre-surveys, post-surveys and analyzing final grades. The analysis of the surveys shows that live coding is highly beneficial in enhancing the understanding of design patterns.

An Interactive Approach to Teaching Git Version Control System
Elgun Jabrayilzade, Fatih Sevban Uyanık, Emre Sülün, Eray Tüzün
CSEE&T / HICSS 2022 ▪ Conference on Software Engineering Education and Training: Special Track of Hawaii International Conference on System Sciences

Although the Git version control system is widely used in software engineering, it has been observed that most Computer Science and Software Engineering students do not have the necessary knowledge and practices to use Git. To address this issue, we have prepared a Git and GitHub training program consisting of four sessions as a part of the Object-Oriented Software Engineering course where junior students utilized these tools for their term projects. The program was conducted in three academic terms for a total of 258 students. To evaluate the effectiveness of the training sessions, we have conducted two surveys, before (224 respondents) and after (200 respondents) the program. According to the survey results, the number of students considering themselves insufficient to use the tools for their projects decreased from 67% to 9% after the training program. Additionally, the majority of the students found the lectures and laboratory assignments beneficial.

Towards a taxonomy of code review smells
Emre Doğan, Eray Tüzün
Information and Software Technology

Code review is a crucial step of the software development life cycle in order to detect possible problems in source code before merging the changeset to the codebase. Although there is no consensus on a formally defined life cycle of the code review process, many companies and open source software (OSS) communities converge on common rules and best practices. In spite of minor differences in different platforms, the primary purpose of all these rules and practices leads to a faster and more effective code review process. Non-conformance of developers to this process does not only reduce the advantages of the code review but can also introduce waste in later stages of the software development. Objectives: The aim of this study is to provide an empirical understanding of the bad practices followed in the code review process, that are code review (CR) smells. Methods: We first conduct a multivocal literature review in order to gather code review bad practices discussed in white and gray literature. Then, we conduct a targeted survey with 32 experienced software practitioners and perform follow-up interviews in order to get their expert opinion. Based on this process, a taxonomy of code review smells is introduced. To quantitatively demonstrate the existence of these smells, we analyze 226,292 code reviews collected from eight OSS projects. Results: We observe that a considerable number of code review smells exist in all projects with varying degrees of ratios. The empirical results illustrate that 72.2% of the code reviews among eight projects are affected by at least one code review smell. Conclusion: The empirical analysis shows that the OSS projects are substantially affected by the code review smells. The provided taxonomy could provide a foundation for best practices and tool support to detect and avoid code review smells in practice.

Augmenting Code Review Experience Through Visualization
Faruk Balcı, Dilruba Sultan Haliloğlu, Onur Şahin, Cankat Tilki, Mehmet Ata Yurtsever, Eray Tüzün
Working Conference on Software Visualization (VISSOFT 2021)

Code review is a systematic inspection of the codebase. It ensures that the software satisfies the required functionalities and standards; thus, it is an essential stage in the modern development process and is used frequently in the industry. For a successful review, the reviewer should be able to identify defects; therefore, the process is highly dependent on the awareness of the reviewer. Currently, code review is done by comparing the line-by-line differences on the codebase. However, this does not give structural information, such as design pattern changes or dependency changes between services. To address this problem, we propose an augmented next-generation code review experience using visual and holistic approaches to streamline the reviewing process. The proposed code review experience shows the structural representation of the change by visualizing it into a UML-like relationship diagram. Along with this diagram, the risk percentage for each commit is highlighted to emphasize the differences that possibly can be most affected by the change. An artifact map is also produced that includes issue-commit relation for bug tracking, to make reviewers aware of the frequent issues that are brought up. This map also highlights the change frequency of the files, for focusing the reviewers on the possibly vulnerable parts of the software for better quality reviews. The proposed visualizations have been implemented in a plugin-based proof-of-concept tool integrated within GitHub, supporting the visual reviewing of changes in codebases developed in Java.

A Fine-grained Data Set and Analysis of Tangling in Bug Fixing Commits
Steffen Herbold, Alexander Trautsch, Benjamin Ledel, Alireza Aghamohammadi, Taher Ahmed Ghaleb, Kuljit Kaur Chahal, Tim Bossenmaier, Bhaveet Nagaria, Philip Makedonski, Matin Nili Ahmadabadi, Kristóf Szabados, Helge Spieker, Matej Madeja, Nathaniel Hoy, Valentina Lenarduzzi, Shangwen Wang, Gema Rodriguez Perez, Ricardo Colomo-Palacios, Roberto Verdecchia, Paramvir Singh, Yihao Qin, Debasish Chakroborti, Willard Davis, Vijay Walunj, Hongjun Wu, Diego Marcilio, Omar Alam, Abdullah Aldaeej, Idan Amit, Burak Turhan, Simon Eismann, Anna-Katharina Wickert, Ivano Malavolta, Matúš Sulír, Fatemeh Fard, Austin Z Henley, Stratos Kourtzanidis, Eray Tüzün, Christoph Treude, Simin Maleki Shamasbi, Ivan Pashchenko, Marvin Wyrich, James C. Davis, Alexander Serebrenik, Ella Albrecht, Ethem Utku Aktas, Daniel Strüber, Johannes Erbel
Empirical Software Engineering

Context: Tangled commits are changes to software that address multiple concerns at once. For researchers interested in bugs, tangled commits mean that they actually study not only bugs, but also other concerns irrelevant for the study of bugs. Objective: We want to improve our understanding of the prevalence of tangling and the types of changes that are tangled within bug fixing commits. Methods: We use a crowd sourcing approach for manual labeling to validate which changes contribute to bug fixes for each line in bug fixing commits. Each line is labeled by four participants. If at least three participants agree on the same label, we have consensus. Results: We estimate that between 17% and 32% of all changes in bug fixing commits modify the source code to fix the underlying problem. However, when we only consider changes to the production code files this ratio increases to 66% to 87%. We find that about 11% of lines are hard to label leading to active disagreements between participants. Due to confirmed tangling and the uncertainty in our data, we estimate that 3% to 47% of data is noisy without manual untangling, depending on the use case. Conclusion: Tangled commits have a high prevalence in bug fixes and can lead to a large amount of noise in the data. Prior research indicates that this noise may alter results. As researchers, we should be skeptics and assume that unvalidated data is likely very noisy, until proven otherwise.
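
The consensus rule described in the Methods (four labels per line, consensus when at least three agree) can be sketched as below. The function name and the label strings are illustrative assumptions and are not taken from the study's tooling.

```python
from collections import Counter


def consensus_label(labels, min_agree=3):
    """Return the label shared by at least `min_agree` participants, else None."""
    if not labels:
        return None
    label, count = Counter(labels).most_common(1)[0]
    return label if count >= min_agree else None


# Hypothetical labels from four participants for two changed lines.
print(consensus_label(["bugfix", "bugfix", "bugfix", "test"]))  # bugfix
print(consensus_label(["bugfix", "bugfix", "test", "test"]))    # None
```

Lines for which this returns None are the ones with no consensus, corresponding to the hard-to-label cases the abstract reports.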

What Makes Agile Software Development Agile?
M. Kuhrmann, P Tell, R Hebig, J Ann-Christin Klunder, J Munch, O Linssen, D Pfahl, M Felderer, C Prause, S Macdonell, J Nakatumba-Nabende, D Raffo, S Beecham, E. Tuzun, G Lopez, N Paez, D Fontdevila, S Licorish, S Kupper, G Ruhe, E Knauss, O Ozcan-Top, P Clarke, F Hugh Mc Caffery, M Genero, A Vizcaino, M Piattini, M Kalinowski, T Conte, R Prikladnicki, S Krusche, A Coskuncay, E Scott, F Calefato, S Pimonova, R Pfeiffer, U Pagh Schultz, R Heldal, M Fazal-Baqaie, C Anslow, M Nayebi, K Schneider, S Sauer, D Winkler, S Biffl, C Bastarrica, I Richardson
IEEE Transactions on Software Engineering

Together with many success stories, promises such as the increase in production speed and the improvement in stakeholders' collaboration have contributed to making agile a transformation in the software industry in which many companies want to take part. However, driven either by a natural and expected evolution or by contextual factors that challenge the adoption of agile methods as prescribed by their creator(s), software processes in practice mutate into hybrids over time. Are these still agile? In this article, we investigate the question: what makes a software development method agile? We present an empirical study grounded in a large-scale international survey that aims to identify software development methods and practices that improve or tame agility. Based on 556 data points, we analyze the perceived degree of agility in the implementation of standard project disciplines and its relation to used development methods and practices. Our findings suggest that only a small number of participants operate their projects in a purely traditional or agile manner (under 15%). That said, most project disciplines and most practices show a clear trend towards increasing degrees of agility. Compared to the methods used to develop software, the selection of practices has a stronger effect on the degree of agility of a given discipline. Finally, there are no methods or practices that explicitly guarantee or prevent agility. We conclude that agility cannot be defined solely at the process level. Additional factors need to be taken into account when trying to implement or improve agility in a software company. Finally, we discuss the field of software process-related research in the light of our findings and present a roadmap for future research.

Ground Truth Deficiencies in Software Engineering: When Codifying the Past Can be Counterproductive
E. Tüzün, H. Erdogmus, M. T. Baldassarre, M. Felderer, R. Feldt and B. Turhan
IEEE Software

Many software engineering tools build and evaluate their models based on historical data to support development and process decisions. These models help us answer numerous interesting questions, but have their own caveats. In a real-life setting, the objective function of human decision-makers for a given task might be influenced by a whole host of factors that stem from their cognitive biases, subverting the ideal objective function required for an optimally functioning system. Relying on this data as ground truth may give rise to systems that end up automating software engineering decisions by mimicking past sub-optimal behaviour. We illustrate this phenomenon and suggest mitigation strategies to raise awareness.

Detection and Elimination of Systematic Labeling Bias in Code Reviewer Recommendation Systems
Kazım Ayberk Tecimer, Eray Tüzün, Hamdi Dibeklioğlu, Hakan Erdogmus
The International Conference on Evaluation and Assessment in Software Engineering (EASE 2021)

Reviewer selection in modern code review is crucial for effective code reviews. Several techniques exist for recommending reviewers appropriate for a given pull request (PR). Most code reviewer recommendation techniques in the literature build and evaluate their models based on datasets collected from real projects using open-source or industrial practices. The techniques invariably presume that these datasets reliably represent the "ground truth." In the context of a classification problem, ground truth refers to the objectively correct labels of a class used to build models from a dataset or evaluate a model's performance. In a project dataset used to build a code reviewer recommendation system, the recommended code reviewer picked for a PR is usually assumed to be the best code reviewer for that PR. However, in practice, the recommended code reviewer may not be the best possible code reviewer, or even a qualified one. Recent code reviewer recommendation studies suggest that the datasets used tend to suffer from systematic labeling bias, making the ground truth unreliable. Therefore, models and recommendation systems built on such datasets may perform poorly in real practice. In this study, we introduce a novel approach to automatically detect and eliminate systematic labeling bias in code reviewer recommendation systems. The bias that we remove results from selecting reviewers that do not ensure a permanently successful fix for a bug-related PR. To demonstrate the effectiveness of our approach, we evaluated it on two open-source project datasets (HIVE and QT Creator) and with five code reviewer recommendation techniques (Profile-Based, RSTrace, Naive Bayes, k-NN, and Decision Tree). Our debiasing approach appears promising since it improved the Mean Reciprocal Rank (MRR) of the evaluated techniques by up to 26% in the datasets used.

Towards a Taxonomy of Bug Tracking Process Smells: A Quantitative Analysis
Khushbakht Ali Qamar, Emre Sülün, Eray Tüzün
EuroMicro 2021

Bug tracking is the process of monitoring and reporting malfunctions or issues found in software. While there is no consensus on a formally specified bug tracking process, certain rules and best practices for an optimal bug tracking process are accepted by many companies and open-source software (OSS) projects. Despite slight variations between different platforms, the primary aim of all these rules and practices is to perform a more efficient bug tracking process. Practitioners' noncompliance with the best practices not only impedes the benefits of the bug tracking process but also negatively affects the other phases of the life cycle of software development. In this study, based on the results of a multivocal literature review, we analyzed 60 sources in academic and gray literature and propose a taxonomy of 12 bad practices in the bug tracking process, that is, bug tracking process smells. To quantitatively analyze these process smells, we inspect bug reports collected from six projects. Among these projects, four are Jira-based (MongoDB Core Server, Evergreen, Confluence Server & Data Center, Jira Server & Data Center) and the other two are Bugzilla-based (GCC and Wireshark). We observed that a considerable number of bug tracking process smells exist in all projects with varying ratios.

A Review of Code Reviewer Recommendation Studies: Challenges and Future Directions
H. Alperen Cetin, Emre Doğan, Eray Tüzün
Science of Computer Programming Journal 2021

Code review is the process of inspecting code changes by a developer who is not involved in the development of the changeset. One of the initial and important steps of code review process is selecting code reviewer(s) for a given code change. To maximize the benefits of the code review process, the appropriate selection of the reviewer is essential. Code reviewer recommendation has been an active research area over the last few years, and many recommendation models have been proposed in the literature.
In this study, we conduct a systematic literature review by inspecting 29 primary studies published from 2009 to 2020. Based on the outcomes of our review: (1) most preferred approaches are heuristic approaches closely followed by machine learning approaches, (2) the majority of the studies use open source projects to evaluate their models, (3) the majority of the studies prefer incremental training set validation techniques, (4) most studies suffer from reproducibility problems, (5) model generalizability and dataset integrity are the most common validity threats for the models and (6) refining models and conducting additional experiments are the most common future work discussions in the studies.

Characterizing Duplicate Bugs: An Empirical Analysis
Berfin Küçük, Eray Tüzün
The Third International Workshop On Intelligent Bug Fixing, 2021

Bug handling is an essential part of the software development process. Ideally, in a bug tracking system, bugs are reported, fixed, verified, and closed. In some cases, bugs have to be reopened mostly due to an incorrect fix. However, instead of reopening the existing bug report, users may submit a new report on a previously reported bug, which causes duplicate bug reports. Additionally, users might report duplicate bugs if they are unable to reopen the previously reported bugs due to the bug being unresolved (i.e., in progress) and when they miss previously reported bug reports. These duplicate bug reports may cost extra maintenance efforts in triaging and fixing bugs. There have been several studies on characterizing reopened bugs and duplicate bug reports, however, to the best of our knowledge, there has been no prior work on understanding the dynamics of their intersection, which is missed reopen bugs. Our study is based on analyzing the differences between duplicate and non-duplicate bugs, and further categorizing the duplicated bugs. In this regard, we categorize duplicate bugs according to their creation time with respect to their master’s resolution status as Master-Unresolved bugs and Master-Resolved (Missed Reopen bugs) to distinguish their properties. We compare these two different types of bugs in terms of various aspects such as their relationships to their master bugs, bug surface time, bug fix time, bug’s severity, and the number of users involved. We perform case studies using the Eclipse and Mozilla projects’ bug repositories that include more than 165,500 and 394,000 bug reports respectively.

RSTrace+: Reviewer Suggestion using Software Artifact Traceability Graphs
Emre Sülün, Eray Tüzün, Uğur Doğrusöz
Information and Software Technology, 2020

Context:
Various types of artifacts (requirements, source code, test cases, documents, etc.) are produced throughout the lifecycle of a software. These artifacts are connected with each other via traceability links that are stored in modern application lifecycle management repositories. Throughout the lifecycle of a software, various types of changes can arise in any one of these artifacts. It is important to review such changes to minimize their potential negative impacts. To make sure the review is conducted properly, the reviewer(s) should be chosen appropriately.
Objective:
We previously introduced a novel approach, named RSTrace, to automatically recommend reviewers that are best suited based on their familiarity with a given artifact. In this study, we introduce an advanced version of RSTrace, named RSTrace+ that accounts for recency information of traceability links including practical tool support for GitHub.
Methods:
In this study, we conducted a series of experiments on finding the appropriate code reviewer(s) using RSTrace+ and provided a comparison with the other code reviewer recommendation approaches.
Results:
We had initially tested RSTrace+ on an open source project (Qt 3D Studio) and achieved a top-3 accuracy of 0.89 with an MRR (mean reciprocal ranking) of 0.81. In a further empirical evaluation of 40 open source projects, we compared RSTrace+ with Naive-Bayes, RevFinder and Profile based approach, and observed higher accuracies on the average.
Conclusion:
We confirmed that the proposed reviewer recommendation approach yields promising top-k and MRR scores on the average compared to the existing reviewer recommendation approaches. Unlike other code reviewer recommendation approaches, RSTrace+ is not limited to recommending reviewers for source code artifacts and can potentially be used for recommending reviewers for other types of artifacts. Our approach can also visualize the affected artifacts and help the developer to make assessments of the potential impacts of change to the reviewed artifact.
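
The top-k and MRR scores reported above can be illustrated with a short Python sketch. It uses one common definition of each metric (a recommendation counts as a hit if any of the first k recommended reviewers actually reviewed the change; the reciprocal rank is 1 divided by the position of the first correct reviewer, or 0 if none appears). The exact evaluation protocol of the paper may differ, and the function names and sample data below are hypothetical.

```python
from typing import List, Sequence, Set, Tuple

# One evaluation case: (ranked recommendations, set of actual reviewers).
Case = Tuple[Sequence[str], Set[str]]


def reciprocal_rank(ranked: Sequence[str], actual: Set[str]) -> float:
    """1 / position of the first correct reviewer, or 0.0 if none is recommended."""
    for position, reviewer in enumerate(ranked, start=1):
        if reviewer in actual:
            return 1.0 / position
    return 0.0


def mean_reciprocal_rank(cases: List[Case]) -> float:
    """Average reciprocal rank over all evaluation cases."""
    if not cases:
        raise ValueError("at least one evaluation case is required")
    return sum(reciprocal_rank(ranked, actual) for ranked, actual in cases) / len(cases)


def top_k_accuracy(cases: List[Case], k: int) -> float:
    """Share of cases where at least one of the first k recommendations is correct."""
    if not cases:
        raise ValueError("at least one evaluation case is required")
    hits = sum(1 for ranked, actual in cases if any(r in actual for r in ranked[:k]))
    return hits / len(cases)


# Hypothetical review history: four changes with recommended and actual reviewers.
cases: List[Case] = [
    (["alice", "bob", "carol"], {"alice"}),   # correct at rank 1
    (["bob", "carol", "alice"], {"alice"}),   # correct at rank 3
    (["dave", "erin", "frank"], {"zoe"}),     # never correct
    (["carol", "alice"], {"alice", "bob"}),   # correct at rank 2
]

print("MRR:", round(mean_reciprocal_rank(cases), 4))
print("top-1:", top_k_accuracy(cases, 1))
print("top-3:", top_k_accuracy(cases, 3))
```

Both metrics treat the historically assigned reviewers as correct, so they inherit any labeling bias present in that history.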

CRSG: A Serious Game for Teaching Code Review
Kaan Ünlü, Barış Ardıç, Eray Tüzün
ESEC/FSE 2020 Tool Demonstrations Track, 2020

The application of code review in a development environment is essential, but this skill is not taught very often in an educational context despite its wide usage. To streamline the teaching process of code review, we propose a browser-based "Code Review Serious Game" (CRSG) with high accessibility, progressive level difficulty and an evolvable foundation for prospective improvements or changes. The application is built as a serious game to reinforce the learning experience of its users by immersing them in its story and theme, helping them learn while having fun. The effectiveness of the game components is measured with a case study of 132 students of 2 software engineering courses. The promising result of this case study suggests CRSG can indeed be used effectively to teach code review. The demo video for the game can be accessed at https://youtu.be/FLnr3p4bhOg, and CRSG itself at: https://github.com/barisardic/crsg.

Identifying Key Developers using Artifact Traceability Graphs
H. Alperen Çetin, Eray Tüzün
16th International Conference on Predictive Modeling in Software Engineering (PROMISE 2020)

Developers are the most important resource to build and maintain software projects. For various reasons, some developers take on more responsibility, and these developers are more valuable and indispensable for the project. Without them, the success of the project would be at risk. We use the term key developers for these essential and valuable developers, and identifying them is a crucial task for managerial decisions such as risk assessment for potential developer resignations. We study key developers under three categories: jacks, mavens and connectors. A typical jack (of all trades) has broad knowledge of the project and is familiar with different parts of the source code, whereas mavens represent the developers who are the sole experts in specific parts of the projects. Connectors are the developers who involve different groups of developers or teams. They are like bridges between teams. To identify key developers in a software project, we propose to use traceable links among software artifacts such as the links between change sets and files. First, we build an artifact traceability graph, then we define various metrics to find key developers. We conduct experiments on three open source projects: Hadoop, Hive and Pig. To validate our approach, we use developer comments in issue tracking systems and demonstrate that the key developers identified by our approach match the top commenters by up to 92%.

Creation of a Serious Game For Teaching Code Review: An Experience Report
Barış Ardıç, İrem Yurdakul, Eray Tüzün
32nd IEEE International Conference on Software Engineering Education & Training (CSEE&T 2020)

Code review, a manual inspection of source code by developers other than the author, is a frequently used practice for improving code quality in the software development life-cycle. Employing a tool-based review of code changes has become the norm for a wide variety of open source and industrial systems. Despite its widespread usage and importance, software development practices such as code review are often not addressed in typical Software Engineering education. To address this knowledge gap, we propose to use a serious game approach for teaching code review practices. In this study, we define our learning objectives and design a code review serious game along with its companion quizzes. Then we conduct a small preliminary experiment in order to procure feedback. Using the results of the experiment and participant interviews, we improve our game prototype for integration into a software engineering course while optimizing the initial experiment for students' benefit. We document the process, lessons learned and the future directions of the game. The results we gather indicate that the game is ready to be used in a software engineering course setting.

Understanding the knowledge gaps of software engineers: An empirical analysis based on SWEBOK
Vahid Garousi, Görkem Giray, Eray Tüzün
ACM Transactions on Computing Education, 2019

Context:
Knowledge level and productivity of the software engineering (SE) workforce are the subject of regular discussions among practitioners, educators, and researchers. There have been many efforts to measure and improve the knowledge gap between SE education and industrial needs.
Objective:
Although the existing efforts for aligning SE education and industrial needs have provided valuable insights, there is a need for analyzing the SE topics in a more “fine-grained” manner; i.e., knowing that SE university graduates should know more about requirements engineering is important, but it is more valuable to know the exact topics of requirements engineering that are most important in the industry.
Method:
We achieve the above objective by assessing the knowledge gaps of software engineers by designing and executing an opinion survey on levels of knowledge learned in universities versus skills needed in industry. We designed the survey by using the SE knowledge areas (KAs) from the latest version of the Software Engineering Body of Knowledge (SWEBOK v3), which classifies the SE knowledge into 12 KAs, which are themselves broken down into 67 subareas (sub-KAs) in total. Our analysis is based on (opinion) data gathered from 129 practitioners, who are mostly based in Turkey.
Results:
Based on our findings, we recommend that educators should include more materials on software maintenance, software configuration management, and testing in their SE curriculum. Based on the literature as well as the current trends in industry, we provide actionable suggestions to improve SE curriculum to decrease the knowledge gap.

Investigating the Validity of Ground Truth in Code Reviewer Recommendation Studies (Best Emerging Results and Vision Paper Award)
Emre Doğan, Eray Tüzün, K. Ayberk Tecimer and H. Altay Güvenir
13th ACM/IEEE International Symposium on Empirical Software Engineering and Measurement (ESEM 2019)

Background: Selecting the ideal code reviewer in modern code review is a crucial first step to perform effective code reviews. There are several algorithms proposed in the literature for recommending the ideal code reviewer for a given pull request. The success of these code reviewer recommendation algorithms is measured by comparing the recommended reviewers with the ground truth, that is, the assigned reviewers selected in real life. However, in practice, the assigned reviewer may not be the ideal reviewer for a given pull request. Aims: In this study, we investigate the validity of ground truth data in code reviewer recommendation studies. Method: By conducting an informal literature review, we compared the reviewer selection heuristics in real life and the algorithms used in recommendation models. We further support our claims by using empirical data from code reviewer recommendation studies. Results: By literature review, and accompanying empirical data, we show that ground truth data used in code reviewer recommendation studies is potentially problematic. This reduces the validity of the code reviewer datasets and the reviewer recommendation studies. Conclusion: We demonstrated the cases where the ground truth in code reviewer recommendation studies is invalid and discussed the potential solutions to address this issue.

Reviewer Recommendation Using Software Artifact Traceability Graphs
Emre Sülün, Eray Tüzün, Uğur Doğrusöz
15th International Conference on Predictive Models and Data Analytics in Software Engineering - PROMISE '19, 2019

Various types of artifacts (requirements, source code, test cases, documents, etc.) are produced throughout the lifecycle of a software. These artifacts are often related with each other via traceability links that are stored in modern application lifecycle management repositories. Throughout the lifecycle of a software, various types of changes can arise in any one of these artifacts. It is important to review such changes to minimize their potential negative impacts. To maximize benefits of the review process, the reviewer(s) should be chosen appropriately. In this study, we reformulate the reviewer suggestion problem using software artifact traceability graphs. We introduce a novel approach, named RSTrace, to automatically recommend reviewers that are best suited based on their familiarity with a given artifact. The proposed approach, in theory, could be applied to all types of artifacts. For the purpose of this study, we focused on the source code artifact and conducted an experiment on finding the appropriate code reviewer(s). We initially tested RSTrace on an open source project and achieved top-3 recall of 0.85 with an MRR (mean reciprocal ranking) of 0.73. In a further empirical evaluation of 37 open source projects, we confirmed that the proposed reviewer recommendation approach yields promising top-k and MRR scores on the average compared to the existing reviewer recommendation approaches.

Identifying the Most Valuable Developers Using Artifact Traceability Graphs
H. Alperen Cetin
In Proceedings of The 27th ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering (ESEC/FSE 2019)

Finding the most valuable and indispensable developers is a crucial task in software development. We categorize these valuable developers into two categories: connector and maven. A typical connector represents a developer who connects different groups of developers in a large-scale project. Mavens represent the developers who are the sole experts in specific modules of the project. To identify the connectors and mavens, we propose an approach using graph centrality metrics and connections of traceability graphs. We conducted a preliminary study on this approach by using two open source projects: QT 3D Studio and Android. Initial results show that the approach helps identify the essential developers.

Suggesting Reviewers of Software Artifacts Using Traceability Graphs
Emre Sülün
In Proceedings of The 27th ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering (ESEC/FSE 2019)

During the lifecycle of a software project, software artifacts constantly change. A change should be peer-reviewed to ensure the software quality. To maximize the benefit of review, the reviewer(s) should be chosen appropriately. However, choosing the right reviewer(s) might not be trivial especially in large projects. Researchers developed different methods to recommend reviewers. In this study, we introduce a novel approach for reviewer recommendation problem. Our approach utilizes the traceability graph of a software project and assigns a know-about score to each developer, then recommends the developers who have the maximum know-about score for an artifact. We tested our approach on an open source project and achieved top-3 recall of 0.85 with an MRR (mean reciprocal ranking) of 0.73.

Aligning software engineering education with industrial needs: a meta-analysis
Vahid Garousi, Görkem Giray, Eray Tüzün, Cagatay Catal, Michael Felderer
Journal of Systems and Software, 2019

Context
According to various reports, many software engineering (SE) graduates often face difficulties when beginning their careers, which is mainly due to misalignment of the skills learned in university education with what is needed in the software industry.
Objective
Our objective is to perform a meta-analysis to aggregate the results of the studies published in this area to provide a consolidated view on how to align SE education with industry needs, to identify the most important skills and also existing knowledge gaps.
Method
To synthesize the body of knowledge, we performed a systematic literature review (SLR), in which we systematically selected a pool of 35 studies and then conducted a meta-analysis using data extracted from those studies.
Results
Via a meta-analysis and using data from 13 countries and over 4,000 data points, highlights of the SLR include: (1) software requirements, design, and testing are the most important skills; and (2) the greatest knowledge gaps are in configuration management, SE models and methods, SE process, design (and architecture), as well as in testing.
Conclusion
This paper provides implications for both educators and hiring managers by listing the most important SE skills and the knowledge gaps in the industry.

Catching up with Method and Process Practice: A new Baseline for Researchers
HELENA Consortium
International Conference on Software Engineering in Practice, 2019

Software development methods are usually not applied by the book. Companies are under pressure to continuously deploy software products that meet market needs and stakeholders’ requests. To implement efficient and effective development processes, companies utilize multiple frameworks, methods and practices, and combine these into hybrid methods. A common combination contains a rich management framework to organize and steer projects complemented with a number of smaller practices providing the development teams with tools to complete their tasks. In this paper, based on 732 data points collected through an international survey, we study the software development process use in practice. Our results show that 76.8% of the companies implement hybrid methods. Company size as well as the strategy in devising and evolving hybrid methods affect the suitability of the chosen process to reach company or project goals. Our findings show that companies that combine planned improvement programs with process evolution can increase their process’ suitability by up to 5%.

Adopting Integrated Application Lifecycle Management within a Large-Scale Software Company: An Action Research Approach
Eray Tuzun, Bedir Tekinerdogan, Yagup Macit, Kursat Ince
Journal of Systems and Software, 2018

Context
Application Lifecycle Management (ALM) is a paradigm for integrating and managing the various activities related to the governance, development and maintenance of software products. In the last decade, several ALM tools have been proposed to support this process, and an increasing number of companies have started to adopt ALM.
Objective
We aim to investigate the impact of adopting ALM in a real industrial context to understand and justify both the benefits and obstacles of applying integrated ALM.
Method
As a research methodology, we apply action research that we have carried out within HAVELSAN, a large-scale IT company. The research was carried out over a period of seven years starting in 2010, when the ALM initiative was started in the company to increase productivity and decrease maintenance costs.
Results
The paper presents the results of the action research that includes the application of ALM practices. The transitions among the different steps are discussed in detail, together with the identified obstacles, benefits and lessons learned.
Conclusions
Our seven-year study shows that the adoption of ALM processes is not trivial and its success is related to many factors. An important conclusion is that a piecemeal solution as provided by ALM 1.0 is not feasible for the complex process and tool integration problems of large enterprises. Hence the transition to ALM 2.0 was found necessary to cope with the organizational and business needs. Although ALM 2.0 appeared to be a more mature ALM approach, there are still obstacles that need attention from both researchers and practitioners.

An Auction-Based Serious Game for Bug Tracking
Cagdas Usfekes, Eray Tuzun, Murat Yılmaz, Yagup Macit, Paul Clarke
IET Software, 2019

Today, one of the challenges in software engineering is utilising application lifecycle management (ALM) tools effectively in software development. In particular, it is hard for software developers to engage with the work items that are assigned to them in these ALM tools. In this study, the authors have focused on bug tracking in ALM where one of the most important metrics is mean time to resolution, that is, the average time to fix a reported bug. To improve this metric, they developed a serious game application based on an auction-based reward mechanism. The ultimate aim of this approach is to create an incentive structure for software practitioners to find and resolve bugs that are auctioned, where participants are encouraged to solve and test more bugs in less time and improve the quality of software development in a competitive environment. They conduct hypothesis tests by performing a Monte Carlo simulation. The preliminary results of this research support the idea that using a gamification approach for an issue tracking system enhances productivity and decreases mean time to resolution.

Closing the gap between software engineering education and industrial needs
Vahid Garousi, Görkem Giray, Eray Tüzün, Cagatay Catal, Michael Felderer
IEEE Software

Many recent software engineering graduates often face difficulties when beginning their professional careers, due to misalignment of the skills learned in their university education with what is needed in industry. In this article, we report a literature review of the studies that have been done to make improvements on this issue.

Adopting Augmented Reality for the Purpose of Software Development Process Training and Improvement: An Exploration
İpek Ohri, İrem Öge, Bora Orkun, Murat Yılmaz, Eray Tüzün, Paul Clarke, RV O’Connor
European Conference on Software Process Improvement, 195-206

Augmented reality (AR) is a technological field of study that bridges the physical and digital world together with a view to improving user experience. AR holds great potential to change the delivery of software services or software process improvement by utilizing a specific set of components. The purpose of this exploratory study is to propose an integration framework to support AR for improving the onboarding process, notably in introducing new hires to the development process while performing their daily tasks. In addition, it also aims to enhance the software development workflow process using AR. Similar to a GPS device that can guide you from point A to point B, our goal is to create software artifacts like navigation components where software teams may benefit from digitally enhanced working conditions provided using AR. After conducting a review of the literature, we confirmed that there is a lack of studies about the combination of augmented reality with software engineering disciplines for onboarding. In this paper, we formalized our approach based on the benefits of AR. Ultimately, we propose an AR-based preliminary model for improving the software development process.

IoT System Development Methods
Görkem Giray, Bedir Tekinerdogan, Eray Tüzün
Internet of Things: Challenges, Advances and Applications, CRC Press, 2018 (Book Chapter)

This chapter presents an overview of system development methods (SDMs) dedicated to Internet of Things (IoT) systems. It summarizes IoT SDMs along with their process flows represented using Business Process Model and Notation. The chapter presents the characterization of these methods using the evaluation criteria. The objective of an IoT SDM is to guide a project team in developing and combining these components in order to be able to fulfill user requirements. SDM is preferred over the "software development method" concept since an IoT system encompasses many software, hardware, and communication components. An important aspect in IoT is that changes in the properties of a thing and its corresponding virtual entity need to be synchronized. The IoT Methodology is a generic, lightweight method built on iterative prototyping and Lean start-up approaches. Ignite provides guidance for developing IoT services and software components for IoT devices from a business and project management perspective; however, it does not provide low-level, technical guidance.

Adopting the Essence Framework to Derive a Practice Library for the Development of IoT Systems
Görkem Giray, Bedir Tekinerdogan, Eray Tüzün
Connected Environments for the Internet of Things, Challenges and Solutions, Springer International Publishing, 2018 (Book Chapter)

The Internet of Things (IoT) is a global network of smart devices which enables these objects to collect and exchange data. Research in the IoT is still progressing, and it is now being applied in various domains. One of the key observations is that the development of IoT systems is not trivial and needs to be carefully managed to meet the required functional and quality concerns. Due to the heterogeneous aspects including software, hardware, and communication, developing IoT systems poses various challenges that need to be explicitly considered in the development process and successfully resolved. Unfortunately, little attention has been paid so far to development methods for IoT systems. To address the particular IoT development concerns, we analyze and discuss the existing approaches that target the development of IoT systems. For this purpose, we use the Essence Framework, which has been recently developed as a framework for modeling various kinds of software development practices and methods. We propose an initial practice library, which can be used to develop and/or tailor project-specific IoT system development methods.

Modeling Software Product Line Engineering with Essence Framework
Eray Tüzün, Görkem Giray, Bedir Tekinerdogan, Yagup Macit
International Journal of Informatics Technologies, 2018

Although several software product line engineering (SPLE) methods have been described in the literature, adopting these methods in practice is often not straightforward. A thorough understanding of the methods and their artefacts is necessary to apply the methods in a proper manner, and likewise realize the expected goals of SPLE. Recently the Essence framework has been proposed to model the essential elements of a method and to support the modeling of a broad set of software development methods including plan-driven methods and agile methods. So far, the Essence framework has been applied to single system development methods and not yet to SPLE methods. To enhance the understanding of SPLE methods and support a vision for tailoring SPLE methods, we provide a mapping of an SPLE method to the Essence framework. We present experiences about modeling an SPLE method using the Essence framework within the industrial context of Havelsan.

Are Computer Science and Engineering Graduates Ready for the Software Industry? Experiences from an Industrial Student Training Program
Eray Tuzun, Hakan Erdogmus and Izzet Gokhan Ozbilgin
International Conference on Software Engineering SEET 2018 (Acceptance rate <25%)

It has been 50 years since the term “software engineering” was coined in 1968 at a NATO conference. The field should be relatively mature by now, with most established universities covering core software engineering topics in their Computer Science programs and others offering specialized degrees. However, still many practitioners lament a lack of skills in new software engineering hires. With the growing demand for software engineers from the industry, this apparent gap becomes more and more pronounced. One corporate strategy to address this gap is for the industry to develop supplementary training programs before the hiring process, which could also help companies screen viable candidates. In this paper, we report on our experiences and lessons learned in conducting a summer school program aimed at screening new graduates, introducing them to core skills relevant to the organization and industry, and assessing their attitudes toward mastering those skills before the hiring process begins. Our experience suggests that such initiatives can be mutually beneficial for new hires and companies alike. We support this insight with pre- and post-training data collected from the participants during the first edition of the summer school and a follow-up questionnaire conducted after a year with the participants, 50% of whom were hired by the company shortly after the summer school.