George K. Thiruvathukal
Title/s: Professor of Computer Science
Director of CS Department Computing, Visiting Faculty at Argonne National Laboratory
Specialty Area: high performance & distributed computing, cyber-physical systems, software engineering, programming languages and systems, history of computing, computational and data science, computing education, and ethical/legal/social issues in CS.
Office #: Doyle 301
External Webpage: https://thiruvathukal.com/
George K. Thiruvathukal holds PhD (1995) and MS (1990) in Computer Science from Illinois Institute of Technology and a BA (1988) in Computer Science and Physics with a Mathematics Minor from Lewis University in Romeoville, IL. He is Professor of Computer Science at Loyola University Chicago and Visiting Faculty at Argonne National Laboratory.
For a more detailed bio-sketch, see Dr. Thiruvathukal's website, thiruvathukal.com.
For a list of publications, see Dr. Thiruvathukal's Digital Commons page, works.bepress.com/gkthiruvathukal/.
PublicationsExercises Integrating High School Mathematics with Robot Motion Planning
This paper presents progress in developing exercises for high school students incorporating level-appropriate mathematics into robotics activities. We assume mathematical foundations ranging from algebra to precalculus, whereas most prior work on integrating mathematics into robotics uses only very elementary mathematical reasoning or, at the other extreme, is comprised of technical papers or books using calculus and other advanced mathematics. The exercises suggested are relevant to any differerential-drive robot, which is an appropriate model for many different varieties of educational robots. They guide students towards comparing a variety of natural navigational strategies making use of typical movement primitives. The exercises align with Common Core State Standards for Mathematics.
In testing stateful abstractions, it is often necessary to record interactions, such as method invocations, and express assertions over these interactions. Following the Test Spy design pattern, we can reify such interactions programmatically through additional mutable state. Alternatively, a mocking framework, such as Mockito, can automatically generate test spies that allow us to record the interactions and express our expectations in a declarative domain-specific language. According to our study of the test code for Scala’s Iterator trait, the latter approach can lead to a significant reduction of test code complexity in terms of metrics such as code size (in some cases over 70% smaller), cyclomatic complexity, and amount of additional mutable state required. In this tools paper, we argue that the resulting test code is not only more maintainable, readable, and intentional, but also a better stylistic match for the Scala community than manually implemented, explicitly stateful test spies.
Many computational theories have been developed to improve artificial phonetic classification performance from linguistic auditory streams. However, less attention has been given to psycholinguistic data and neurophysiological features recently found in cortical tissue. We focus on a context in which basic linguistic units–such as phonemes–are extracted and robustly classified by humans and other animals from complex acoustic streams in speech data. We are especially motivated by the fact that 8-month-old human infants can accomplish segmentation of words from fluent audio streams based exclusively on the statistical relationships between neighboring speech sounds without any kind of supervision. In this paper, we introduce a biologically inspired and fully unsupervised neurocomputational approach that incorporates key neurophysiological and anatomical cortical properties, including columnar organization, spontaneous micro-columnar formation, adaptation to contextual activations and Sparse Distributed Representations (SDRs) produced by means of partial N-Methyl-D-aspartic acid (NMDA) depolarization. Its feature abstraction capabilities show promising phonetic invariance and generalization attributes. Our model improves the performance of a Support Vector Machine (SVM) classifier for monosyllabic, disyllabic and trisyllabic word classification tasks in the presence of environmental disturbances such as white noise, reverberation, and pitch and voice variations. Furthermore, our approach emphasizes potential self-organizing cortical principles achieving improvement without any kind of optimization guidance which could minimize hypothetical loss functions by means of–for example–backpropagation. Thus, our computational model outperforms multiresolution spectro-temporal auditory feature representations using only the statistical sequential structure immerse in the phonotactic rules of the input stream.
As dataset sizes increase, data analysis tasks in high performance computing (HPC) are increasingly dependent on sophisticated dataflows and out-of-core methods for efficient system utilization. In addition, as HPC systems grow, memory access and data sharing are becoming performance bottlenecks. Cloud computing employs a data processing paradigm typically built on a loosely connected group of low-cost computing nodes without relying upon shared storage and/or memory. Apache Spark is a popular engine for large-scale data analysis in the cloud, which we have successfully deployed via job submission scripts on production clusters.
In this paper, we describe common parallel analysis dataflows for both Message Passing Interface (MPI) and cloud based applications. We developed an effective benchmark to measure the performance characteristics of these tasks using both types of systems, specifically comparing MPI/C-based analyses with Spark. The benchmark is a data processing pipeline representative of a typical analytics framework implemented using map-reduce. In the case of Spark, we also consider whether language plays a role by writing tests using both Python and Scala, a language built on the Java Virtual Machine (JVM). We include performance results from two large systems at Argonne National Laboratory including Theta, a Cray XC40 supercomputer on which our experiments run with 65,536 cores (1024 nodes with 64 cores each). The results of our experiments are discussed in the context of their applicability to future HPC architectures. Beyond understanding performance, our work demonstrates that technologies such as Spark, while typically aimed at multi-tenant cloud-based environments, show promise for data analysis needs in a traditional clustering/supercomputing environment.
Background: Developers face challenges in building high-quality research software due to its inherent complexity. These challenges can reduce the confidence users have in the quality of the result produced by the software. Use of a defined software development process, which divides the development into distinct phases, results in improved design, more trustworthy results, and better project management. Aims: This paper focuses on gaining a better understanding of the use of software development process for research software. Method: We surveyed research software developers to collect information about their use of software development processes. We analyze whether and demographic factors influence the respondents' use of and perceived value in defined process. Results: Based on 98 responses, research software developers appear to follow a defined software development process at least some of the time. The respondents also have a strong positive perception about the value of following processes. Conclusions: To produce high-quality and reliable research software, which is critical for many research domains, research software developers must follow a proper software development process. The results indicate a positive perception of value about using defined development processes that should lead to both short-term benefits through improved results and long-term benefits through more maintainable software.