Practical, Automated Large Scale Software Reengineering with DMS
Jing Zhang
Slides from Dr. Ira D. Baxter
Semantic Designs, Inc.
www.semanticdesigns.com

Feb. 24th, 2004
© Semantic Designs, Inc.

Modern Software Engineering
• Large software systems in multiple languages
• 80% Maintenance/Enhancements
• Little accurate design documentation
• Largely manual effort
• How to:
  – Understand software structure?
  – Reorganize structure to enable change?
  – Make sweeping changes?
  – Make reliable changes?


DMS® Software Reengineering Toolkit
• Customized, automated analysis, modification, porting or generation
  – For sources of large scale software systems
    • Scalable to millions of source lines, tens of thousands of files
    • Parallel processing foundations to support scale
  – Handles many and mixed languages simultaneously
    • C, C++, Java, COBOL, HTML, Ada, Fortran, SQL, XML, assembler, …
  – Generalized compiler technology conveniently integrated
    • Parsing, Analyzing, Transforming, Prettyprinting
    • Enables practical customization for desired automation task
    • Predefined support for standard computer languages


DMS Concept
Automated! Repeatable! Scalable!

[Figure: DMS takes software system sources and captured background knowledge as inputs, and produces analyses and revised or generated sources.]

Software System Sources (MSLOC), example:
    … a in b … if a in c then p=7 else p=2;

Analyses of Software System:
    a in b, b in c, a in c

Revised/Generated Software Sources:
    … a in b … p=7;

Background Knowledge for Software Analysis or Modification problem (KSLOC), captured from Semantic Designs or Consultants/Sr. Programmers/Engineers:
• Language Definitions: Ada, SQL, Java …
• General Language Analyses: "var=expression" => "var modified"
• General Language Transforms: "if true then s else t endif -> s"
• DSL Language Analyses: "x in y and y in z" => "x in z"
• DSL Application Context Knowledge: "b in c"
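
The slide's example can be sketched in a few lines of Java. This is only an illustration of the idea, not how DMS implements it: the class `ContainmentDemo` and its methods are hypothetical names. Facts of the form "x in y" are closed under the transitive rule, and a conditional whose test is provable from those facts is replaced by its then-branch. A test that cannot be proved is left untouched, since absence of a fact does not make it false.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch: background knowledge "x in y" plus the rule
// "x in y and y in z => x in z" is used to simplify a conditional.
final class ContainmentDemo {

    // Returns every y such that "start in y" follows from the facts by transitivity.
    static Set<String> closure(Map<String, Set<String>> facts, String start) {
        Set<String> reached = new LinkedHashSet<>();
        Deque<String> work = new ArrayDeque<>();
        work.push(start);
        while (!work.isEmpty()) {
            String cur = work.pop();
            for (String next : facts.getOrDefault(cur, Collections.<String>emptySet())) {
                if (reached.add(next)) {
                    work.push(next);
                }
            }
        }
        return reached;
    }

    // "if x in y then T else E" becomes T when "x in y" is provable;
    // otherwise the conditional is returned unchanged.
    static String simplify(Map<String, Set<String>> facts, String x, String y,
                           String thenStmt, String elseStmt) {
        if (closure(facts, x).contains(y)) {
            return thenStmt;
        }
        return "if " + x + " in " + y + " then " + thenStmt + " else " + elseStmt;
    }

    public static void main(String[] args) {
        Map<String, Set<String>> facts = new HashMap<>();
        facts.put("a", new HashSet<>(Arrays.asList("b"))); // from the source: a in b
        facts.put("b", new HashSet<>(Arrays.asList("c"))); // application context: b in c
        System.out.println(simplify(facts, "a", "c", "p=7;", "p=2;"));
    }
}
```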


Overview
• DMS® Software Reengineering Toolkit
  – Defining notations ("domains") for specs and legacy systems
  – Parsing and prettyprinting
  – Transformation rule mechanics

• Applications for Software Quality Improvement
  – C++ preprocessor conditional removal
  – Software Test Coverage
  – Cross Reference and Dead Code
  – Refactoring Java
  – Automatic Code Generation (XML Parsers)
  – Clone Detection/Removal
  – Porting application software to new languages
  – Restructuring Legacy Applications

DMS Domain Parts
• Notation
  – External Form (what you can say: string or graphical)
  – Internal Form (how DMS stores it)
  – Parser (how to convert external form to internal form)
  – PrettyPrinter (how to display the Internal Form)

• Semantics
  – Optimizations (how to optimize in the domain)
  – Refinements (how to transform IF to another IF)
  – Analyzers (how to analyze in the domain)


DMS Domain for Java Parser + Pretty Printer
nested_class_declaration = nested_class_modifiers class_header class_body ;
<<PrettyPrinter>>: { V(H(nested_class_modifiers,class_header),class_body); }

class_header = 'class' IDENTIFIER ;
<<PrettyPrinter>>: { H('class',IDENTIFIER); }

class_header = 'class' IDENTIFIER 'implements' name_list ;
<<PrettyPrinter>>: { H('class',IDENTIFIER,'implements',name_list); }

class_header = 'class' IDENTIFIER 'extends' name ;
<<PrettyPrinter>>: { H('class',IDENTIFIER,'extends',name); }

class_header = 'class' IDENTIFIER 'extends' name 'implements' name_list ;
<<PrettyPrinter>>: { H('class',IDENTIFIER,'extends',name,'implements',name_list); }

class_body = '{' class_body_declarations '}' ;
<<PrettyPrinter>>: { V(H('{',STRING(" "),class_body_declarations),'}'); }

nested_class_modifiers = nested_class_modifiers nested_class_modifier ;
<<PrettyPrinter>>: { H(CH(nested_class_modifiers[1]),nested_class_modifier); }

… + 300 more rules…(COBOL is 3500!)


Parsing to Abstract Syntax Trees
A Program Representation analyzable by Computers

• Use DMS grammar domain to define language syntax
• DMS generates lexer/parser automatically
• Parser reads source file(s)
  – Captures comments
  – Carries out lexical conversions (e.g., FP text -> IEEE binary FP)
  – Builds Abstract Syntax Tree
  – Records position of every node (file, line, col)

• Present capability for the following domains
  – Specification: Spectrum, BNF, Rose Models
  – Technology: XML, IDL, SQL
  – Implementation: C/C++, COBOL, Java, Ada, VB6, Fortran, Verilog

A Simple Java Program
/* Fib.java */
public class NumberTheory
{
    int Fib(int x)
    {
        if (x < 1)
            return 1; // base case
        else
            return Fib(x-1)+Fib(x-2);
    }
}


Abstract Syntax Tree (AST) for Fib Class
… free of lexical properties (‘text shape’) of program ...
[Figure: the Fib.java source from the previous slide shown beside its AST. The node labels below are those in the figure; the nesting is reconstructed from the source code.]

Class Header
    ID `NumberTheory`
Class Body
    Method Declaration
        Method Modifiers
        Type INT
        ID `Fib`
        Parameters
            Parameter
                Type INT
                ID `x`
        Empty Throwlist
        Block
            Stmt Sequence
                If Then Else
                    <
                        ID `x`
                        NUMBER 1
                    Return
                        NUMBER 1
                    Return
                        +
                            Function Call
                                ID `Fib`
                                -
                                    ID `x`
                                    NUMBER 1
                            Function Call
                                ID `Fib`
                                -
                                    ID `x`
                                    NUMBER 2

Not shown: File/line/column annotation on each node


PrettyPrinting: “AntiParsing”
• Conversion of AST back to text file
• Handles indentation, comments, literal formats...
• Uses DMS Box language to compose PP fragments

Box expressions attached to AST node types:
    If Then:  V(H('if','(',condition,')'), I(then_stmt));
    <:        H(expression1,'<',expression2);
    Return:   H('return',expression,';');

[Figure: AST fragment If Then( <(ID `x`, NUMBER 1), Return(NUMBER 1) )]

Prettyprinted result:
    if (x<1)
        return 1;
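
A minimal Java sketch of box composition, assuming H means horizontal concatenation, V means vertical stacking and I means indentation, as the slide's example suggests. The class `BoxDemo` is hypothetical and much simpler than the real DMS Box language: boxes are plain strings, H assumes one-line operands, and spacing between tokens is passed explicitly rather than computed by the prettyprinter.

```java
// Hypothetical sketch of H/V/I box composition; not the DMS implementation.
final class BoxDemo {

    // Horizontal box: operands side by side (assumes each operand is one line).
    static String H(String... boxes) {
        return String.join("", boxes);
    }

    // Vertical box: operands stacked one per line.
    static String V(String... boxes) {
        return String.join("\n", boxes);
    }

    // Indented box: every line of the operand shifted right by four spaces.
    static String I(String box) {
        StringBuilder sb = new StringBuilder();
        for (String line : box.split("\n")) {
            if (sb.length() > 0) {
                sb.append('\n');
            }
            sb.append("    ").append(line);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String condition = H("x", "<", "1");            // H(expression1,'<',expression2)
        String thenStmt = H("return", " ", "1", ";");   // H('return',expression,';')
        String ifStmt = V(H("if", " ", "(", condition, ")"), I(thenStmt));
        System.out.println(ifStmt);
    }
}
```

Because the layout is rebuilt from the tree, the output formatting does not depend on how the original source text was laid out.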

Optimization transform for DMS Rewrite Rule Language
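default base domain Java;

rule merge-ifs(\condition1, \condition2, \then-statements)
  "if (\condition1)
       if (\condition2) { \then-statements }"
  rewrites to
  "if (\condition1 && \condition2) { \then-statements }";

Callouts on the slide: "Domain Name" marks Java in the first line; "Domain Syntax" marks the quoted patterns, which are written in the syntax of that domain.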


DMS transforms work on ASTs, not text
Not fooled by any lexical properties of text!

To modify programs:
1) Define transforms
2) Parse program
3) Apply transforms
   a) match LHS pattern
   b) replace with RHS substitution
4) Prettyprint program

[Figure: the merge-ifs rule drawn as trees]
Left hand side:
    If Then
        \condition1
        If Then
            \condition2
            \then statements
rewrites to
Right hand side:
    If Then
        &&
            \condition1
            \condition2
        \then statements
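
The match-and-replace step can be sketched on a toy AST in Java. This is an illustrative assumption, not DMS code: `MergeIfsDemo`, its `Node` class and the node kinds are hypothetical. The example shows why working on trees matters: when `\condition1` is `a || b`, a textual substitution would produce `a || b && c`, which means something else, while the tree version keeps the condition as one subtree and the printer adds parentheses.

```java
// Hypothetical sketch: the merge-ifs rewrite applied bottom-up to a toy AST.
final class MergeIfsDemo {

    static final class Node {
        final String kind;   // "IfThen", "&&", "||", "Id" or "Block"
        final String text;   // leaf text, null for interior nodes
        final Node[] kids;

        Node(String kind, String text, Node... kids) {
            this.kind = kind;
            this.text = text;
            this.kids = kids;
        }
    }

    static Node id(String name) { return new Node("Id", name); }
    static Node block(String body) { return new Node("Block", body); }
    static Node ifThen(Node cond, Node body) { return new Node("IfThen", null, cond, body); }

    // Rewrites children first, then matches If Then(c1, If Then(c2, body))
    // and replaces it with If Then(&&(c1, c2), body).
    static Node rewrite(Node n) {
        Node[] kids = new Node[n.kids.length];
        for (int i = 0; i < kids.length; i++) {
            kids[i] = rewrite(n.kids[i]);
        }
        Node r = new Node(n.kind, n.text, kids);
        if (r.kind.equals("IfThen") && r.kids[1].kind.equals("IfThen")) {
            Node inner = r.kids[1];
            Node both = new Node("&&", null, r.kids[0], inner.kids[0]);
            return ifThen(both, inner.kids[1]);
        }
        return r;
    }

    // Tiny prettyprinter; parenthesizes "||" operands of "&&".
    static String show(Node n) {
        switch (n.kind) {
            case "IfThen": return "if (" + show(n.kids[0]) + ") " + show(n.kids[1]);
            case "&&":     return operand(n.kids[0]) + " && " + operand(n.kids[1]);
            case "||":     return show(n.kids[0]) + " || " + show(n.kids[1]);
            default:       return n.text;
        }
    }

    static String operand(Node n) {
        return n.kind.equals("||") ? "(" + show(n) + ")" : show(n);
    }

    public static void main(String[] args) {
        Node aOrB = new Node("||", null, id("a"), id("b"));
        Node program = ifThen(aOrB, ifThen(id("c"), block("{ f(); }")));
        System.out.println(show(program));
        System.out.println(show(rewrite(program)));
    }
}
```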


Overview
• DMS® Software Reengineering Toolkit
  – Defining notations ("domains") for specs and legacy systems
  – Parsing and prettyprinting
  – Transformation mechanics

• Applications for Software Quality Improvement
  – C++ preprocessor conditional removal
  – Software Test Coverage
  – Cross Reference and Dead Code
  – Refactoring Java
  – Automatic Code Generation (XML Parsers)
  – Fast HTML generation using XSLT
  – Clone Detection/Removal
  – Porting application software to new languages

Software Test Coverage
• Analysis of code executed by test cases
  – Non-executed code likely to have flaws

• Key problem: tracking program control flow
  – Need way to identify possible program parts
  – Capture "executed" status of parts via tests
  – Display execution status of program parts

• Secondary problem: exercising all parts
  – Exercising individual part
  – Generation of tests from specifications

Test Coverage by Marking visited Blocks
Original "C" program:

bool fibcached[1000];
int fibvalue[1000];

int fib(int i)
{
    int t;
    switch (i)
    {
        case 0:
        case 1: return 1;
        default:
            if (fibcached[i])
                return fibvalue[i];
            else
            {
                t=fib(i-1);
                return t+fib(i-2);
            }
    }
}

Marking program:

bool fibcached[1000];
int fibvalue[1000];
bool visited[7];   /* coverage vector; its declaration is not shown on the slide */

int fib(int i)
{
    int t;
    visited[1]=true;
    switch (i)
    {
        case 0: visited[2]=true;
        case 1: visited[3]=true;
                return 1;
        default:
            visited[4]=true;
            if (fibcached[i])
            {
                visited[5]=true;
                return fibvalue[i];
            }
            else
            {
                visited[6]=true;
                t=fib(i-1);
                return t+fib(i-2);
            }
    }
}

DMS transform(s) to mark program
default base domain C;

rule mark_function_entry(result:type, name:identifier, decls:declaration_list, stmts:statement_sequence)
  = "\result \name { \decls \stmts };"
  rewrites to
    "\result \name { \decls visited[\place\(\stmts\)]=true; \stmts };".

rule mark_if_then_else(condition:expression, tstmt:statement, estmt:statement)
  = "if (\condition) \tstmt else \estmt;"
  rewrites to
    "if (\condition) { visited[\place\(\tstmt\)]=true; \tstmt } else { visited[\place\(\estmt\)]=true; \estmt };".

rule mark_while_loop(condition:expression, stmt:statement)
  = "while (\condition) \stmt"
  rewrites to
    "while (\condition) { visited[\place\(\stmt\)]=true; \stmt }".

rule mark_case_clause(e:expression, stmts:statements)
  = "case \e: \stmts"
  rewrites to
    "case \e: { visited[\place\(\stmts\)]=true; \stmts }".


Test Coverage Tool Flow
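[Figure: tool flow]
1) Source Code + Visit-adding Transforms -> DMS: Add marking code -> Decorated Code, plus Source line information for visited[i]
2) Decorated Code + Test Data -> Compile & Run tests -> visited Vector
3) visited Vector + Source line information for visited[i] -> Display Coverage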
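
The last two stages of the flow (run tests, display coverage) can be sketched with a hand-marked Java version of the fib example. This is an assumption-laden illustration: `CoverageDemo` and `report` are hypothetical names, the marking is written by hand rather than generated by transforms, and the report prints block numbers instead of the source positions a real tool would show.

```java
// Hypothetical sketch: a marked program, one test run, and a coverage report.
final class CoverageDemo {

    static final boolean[] visited = new boolean[7];   // slots 1..6 mark the blocks
    static final boolean[] fibCached = new boolean[1000];
    static final int[] fibValue = new int[1000];

    static int fib(int i) {
        visited[1] = true;
        switch (i) {
            case 0:
                visited[2] = true;   // falls through, as in the slide's C code
            case 1:
                visited[3] = true;
                return 1;
            default:
                visited[4] = true;
                if (fibCached[i]) {
                    visited[5] = true;
                    return fibValue[i];
                } else {
                    visited[6] = true;
                    int t = fib(i - 1);
                    return t + fib(i - 2);
                }
        }
    }

    // One entry per marked block, "hit" if any test executed it.
    static String report() {
        StringBuilder sb = new StringBuilder();
        for (int k = 1; k < visited.length; k++) {
            sb.append(k).append(visited[k] ? ":hit " : ":miss ");
        }
        return sb.toString().trim();
    }

    public static void main(String[] args) {
        fib(3);                          // the "test data"
        System.out.println(report());
    }
}
```

In this sketch block 5 is never reported as hit, because nothing ever fills the cache; that is the kind of unexercised code the coverage display is meant to expose.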


Typical Porting Scenarios
• JOVIAL73 on MIL1750 -> C on PowerPC
  – Military Avionics + Weapons management
• COBOL74 + IDMS -> COBOL85 + SQL
  – UNISYS 1100 retirement; must move data, too!
• K&R C + custom RTOS -> ANSI C + VxWorks
  – Microprocessor modernization
• Clipper + green screen -> Delphi + GUI
  – Legacy 3GL data processing language
• MODCOMP ASM -> C
  – Defense Radar modernization; 12 computer languages!
• Verilog -> VHDL
  – Reuse of Chip Design in new context

Porting Process
[Figure: porting process flow]
• Continuing Application Development -> Code (1 M SLOC)
• DMS Engineer: Analyze Existing Software -> Develop Porting Transforms
• Code + Porting Transforms -> Run DMS Port -> Ported Code -> Test Ported System -> Success
• Errors -> back to Develop Porting Transforms (Repeated Port Cycles)
• No Impact On Development!


A few DMS porting transforms
Jovial to C

default source domain Jovial;
default target domain C;

private rule refine_data_reference_dereference_NAME
    (n1:identifier@C, n2:identifier@C)
    :data_reference->expression
  = "\n1\:NAME @ \n2\:NAME"
  -> "\n2->\n1".

private rule refine_for_loop_letter_2
    (lc:identifier@C, f1:expression@C, f2:expression@C, s:statement@C)
    :statement->statement
  = "FOR \lc\:loop_control : \f1\:formula BY \f2\:formula; \s\:statement"
  -> "{ int \lc = (\f1);
        for(;;\lc += (\f2)) { \s } }"
  if is_letter_identifier(lc).

Callouts on the slide: "Domain Name" marks the domain declarations; "Target Domain Syntax" marks the C pattern on the right-hand side of the second rule.


Porting Transforms in Action
Jovial to C
JOVIAL Source:
    FOR i: j*3 BY 2 ;
        x@mydata = x@mydata+i;

Translated C Result:
    { int i = j*3;
      for (;;i+=2) { mydata->x = mydata->x + i; } }

• Typically lots of small transforms for full translation
• ~1500 rules to translate full Jovial

A More Complex Example
Jovial to C
START
TABLE TFP'D'TWRDET (1:109,12:37);
  BEGIN
  % Main status boolean %
  ITEM TFP'G'TWRDET STATUS (V(YES),V(NO));
  END
TYPE TFP'D'TWRDET'TABLE TABLE (7:23) W 3;
  BEGIN
  ITEM TFP'ITM S 3 POS(0,3); "cube axis"
  END
%begin proc%
PROC PROC'A(c1) S;
  BEGIN
  ITEM match'count U 6; %an item%
  ITEM c1 C 5; "parameter value"
  ITEM c2 C 7;
  IF c1 <= c2 AND c2 > c1;
    match'count = UBOUND(TFP'D'TWRDET,0) + UBOUND(TFP'D'TWRDET'TABLE,0);
  "result off by 1 so adjust"
  match'count = match'count+1;
  BEGIN
    match'count=match'count/2;
    PROC'A = match'count; % return answer %
  END "cleanup and exit";
  END "end proc"
TERM

Features exercised: packed tables with bit offsets, typedefs, functions, string operations, comments

Equivalent C (used with hand-coded macro library):

#include "jovial.h"
static struct {
  /* Main status boolean */
  enum { V(yes$OF$tfp_g_twrdet$OF$tfp_d_twrdet),
         V(no$OF$tfp_g_twrdet$OF$tfp_d_twrdet) } tfp_g_twrdet _size_as_word;
} tfp_d_twrdet[109][26];
typedef union {
  W(3);
  struct {
    POS(0, 3) S(3) tfp_itm:4 _align_to_bit; /* cube axis */
  };
} tfp_d_twrdet_table[17];
static S proc_a(C(5) c1);
/* begin proc */
static S proc_a(C(5) c1)
{
  __typeof__(proc_a(c1)) RESULT(proc_a);
  _main: {
    U(6) match_count;
    C(7) c2;
    if (CHARACTER_COMPARE(BYTE_CONVERT(C(7), c1, 7), c2) <= 0
        && CHARACTER_COMPARE(c2, BYTE_CONVERT(C(7), c1, 7)) > 0)
      match_count = UBOUND(tfp_d_twrdet, 2, 0) + 16;
    /* result off by 1 so adjust */
    match_count = (S(6))match_count + 1;
    {
      match_count = (S(6))match_count / 2;
      RESULT(proc_a) = (S(6))match_count; /* return answer */
    } /* cleanup and exit */ ;
  }
  _return: return RESULT(proc_a);
} /* end proc */

DMS: Conclusion
• Useful to automate analysis/modification of programs
  – Many possible custom reengineering applications
  – A key technology for software quality improvement

• Need generalized compiler-like infrastructure
  – Definable parsers, prettyprinters, transforms
  – Must scale to application systems with MSLOCs

www.semanticdesigns.com/Products/DMS/DMSToolkit.html
WhyDMSForSoftwareQuality.pdf


Where can DMS be applied?
• Program modification
  – Application evolution
    • Functionality change
    • Performance change
    • Technology change
  – Massive changes
    • Porting: new language, target APIs, ...
    • Restructuring: Clone removal, Y2K fix, ...
    • Optimization: Dead code, parallelize, ...
  – Customize reusable component in new context

• Legacy code reverse engineering
  – Design recovery to domain abstractions
    • aid code understanding
    • enable application evolution
  – Incremental design capture
  – Reusable component extraction
  – Component extraction for domains
  – Legacy mergers
    • Unify data schemas
    • Modify applications
    • Convert existing data
  – Business rule extraction
    • Make explicit, easy to read/change

• Domain-specific program generation
  – Partial Differential Equation solvers
  – Factory control synthesis
  – Entity-Relationship compilers
    • DB conversion generators
  – Protocol compilers
  – Automated Test Generation

• Program Analysis
  – Metrics
    • SLOC, conditional, complexity
  – Organization style checking
  – Programming information extraction
    • Clones, Slices, Call Graphs, Side effects
  – Domain information extraction
    • Business rules
    • Idiom recognition
  – Semantic Faults
    • erroneous/dead/useless code
