The first thing I do when I create a project is create the debugger launch config in the `.vscode` folder. A debugger saves me from adding print statements and rebuilding the program. I always wondered how a debugger can stop a program at the line I choose and let me inspect variables. How debuggers work had always been dark magic to me. At last, I managed to learn the dark art by reading several articles and grokking the source code of Delve.
In this post, I'll share what I learned while demystifying the dark art of debuggers.
Problem statement
Let's define the problem statement before coding. I have a sample Go program that prints a random integer every second. The goal is for our debugger program to print `breakpoint hit` before the sample program prints each random integer.
Here is the sample program:
package main
1. import (
2. "fmt"
3. "math/rand"
4. "time"
5. )
6. func main() {
7. for {
8. variableToTrace := rand.Int()
9. fmt.Println(variableToTrace)
10. time.Sleep(time.Second)
11. }
12. }
Now that we know what we want to achieve, let's go step by step and solve the problem.
The first step is to pause the sample program before it prints the random integer. That means we have to set the breakpoint at line 8.
To set the breakpoint at line 8, we must find the address of the first instruction generated for line 8.
Some of us know from high school that every high-level language is eventually translated into machine instructions. So, how do we find the address of the instruction that corresponds to a source line?
Luckily, compilers write debug information into the output binary alongside the machine code. The debug information maps machine instructions back to the high-level source.
For Linux binaries, debug information is usually encoded in the DWARF format.
DWARF is a debugging file format used by many compilers and debuggers to support source level debugging. It addresses the requirements of a number of procedural languages, such as C, C++, and Fortran, and is designed to be extensible to other languages. DWARF is architecture independent and applicable to any processor or operating system. It is widely used on Unix, Linux and other operating systems, as well as in stand-alone environments.
DWARF data can be dumped with the objdump tool.
The command below prints the address of each instruction along with its mapping to a line number and file name.
objdump --dwarf=decodedline ./sample
The output looks similar to this:
File: /home/debugger-example/sample.go
File name    Line number    Starting address    View    Stmt
sample.go    6              0x498200                    x
sample.go    6              0x498213                    x
sample.go    7              0x498221                    x
sample.go    8              0x498223                    x
sample.go    8              0x498225
sample.go    9              0x498233                    x
sample.go    9              0x498236
sample.go    10             0x4982be                    x
sample.go    10             0x4982cb
sample.go    8              0x4982cd                    x
sample.go    9              0x4982d2
sample.go    6              0x4982d9                    x
sample.go    6              0x4982de
sample.go    6              0x4982e0                    x
sample.go    6              0x4982e5                    x
The output shows that `0x498223` is the starting address of line 8 of sample.go.
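The same lookup can also be done from Go instead of objdump. Below is a minimal sketch using the standard library's `debug/elf` and `debug/dwarf` packages. The names `lineRow`, `loadLineTable`, and `firstStmtAddr` are made up for illustration, and picking the lowest statement address for a line is an assumption of this sketch, not necessarily what Delve does.

```go
package main

import (
	"debug/dwarf"
	"debug/elf"
	"fmt"
	"io"
	"os"
	"path/filepath"
)

// lineRow is one row of the DWARF line table (hypothetical helper type).
type lineRow struct {
	File   string
	Line   int
	Addr   uint64
	IsStmt bool
}

// loadLineTable reads the line-table rows of every compile unit in an ELF binary.
func loadLineTable(path string) ([]lineRow, error) {
	f, err := elf.Open(path)
	if err != nil {
		return nil, err
	}
	defer f.Close()
	data, err := f.DWARF()
	if err != nil {
		return nil, err
	}
	var rows []lineRow
	r := data.Reader()
	for {
		entry, err := r.Next()
		if err != nil {
			return nil, err
		}
		if entry == nil {
			break // no more entries
		}
		if entry.Tag != dwarf.TagCompileUnit {
			continue
		}
		lr, err := data.LineReader(entry)
		if err != nil {
			return nil, err
		}
		r.SkipChildren() // only the compile unit itself is needed
		if lr == nil {
			continue // this unit has no line table
		}
		var le dwarf.LineEntry
		for {
			if err := lr.Next(&le); err == io.EOF {
				break
			} else if err != nil {
				return nil, err
			}
			if le.EndSequence || le.File == nil {
				continue
			}
			rows = append(rows, lineRow{File: le.File.Name, Line: le.Line, Addr: le.Address, IsStmt: le.IsStmt})
		}
	}
	return rows, nil
}

// firstStmtAddr returns the lowest statement address recorded for file:line.
func firstStmtAddr(rows []lineRow, file string, line int) (uint64, bool) {
	var best uint64
	found := false
	for _, r := range rows {
		if !r.IsStmt || r.Line != line || filepath.Base(r.File) != file {
			continue
		}
		if !found || r.Addr < best {
			best, found = r.Addr, true
		}
	}
	return best, found
}

func main() {
	if len(os.Args) < 2 {
		fmt.Println("usage: lookup <binary>")
		return
	}
	rows, err := loadLineTable(os.Args[1])
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	if addr, ok := firstStmtAddr(rows, "sample.go", 8); ok {
		fmt.Printf("sample.go:8 starts at %#x\n", addr)
	} else {
		fmt.Println("no statement found for sample.go:8")
	}
}
```

Run against the sample binary, this should report the same address that objdump lists for line 8, although the exact value depends on how the binary was built.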
The next step is to pause the program at the address `0x498223`.
Trick to pause the program execution
The CPU raises a breakpoint trap whenever it executes the `int 3` instruction, whose one-byte opcode on x86 is `0xCC`. So, to pause the program, we just have to overwrite the byte at address `0x498223` with `[]byte{0xcc}`.
In computing and operating systems, a trap, also known as an exception or a fault, is typically a type of synchronous interrupt caused by an exceptional condition (e.g., breakpoint, division by zero, invalid memory access). Source: Wikipedia
Does that mean we have to rewrite the binary on disk at `0x498223`? No, we can patch the memory of the running process using ptrace.
Ptrace to the rescue
ptrace is a system call found in Unix and several Unix-like operating systems. By using ptrace (the name is an abbreviation of "process trace") one process can control another, enabling the controller to inspect and manipulate the internal state of its target. ptrace is used by debuggers and other code-analysis tools, mostly as aids to software development. Source: Wikipedia
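ptrace is the syscall that lets us read and write both the registers and the memory of another process.
Now we know which address to pause at, how to map source lines to addresses, and how to manipulate the memory of the sample program. So, let's put all this knowledge into action.
First, exec the sample program with the `Ptrace` flag set to true, so that we can use ptrace on the child process. The `syscall` package documentation asks callers to invoke `runtime.LockOSThread` before starting a process with this flag, because ptrace requests must come from the OS thread that started the tracee.
/* pin this goroutine to one OS thread; ptrace requests are per thread */
runtime.LockOSThread()
process := exec.Command("./sample")
process.SysProcAttr = &syscall.SysProcAttr{Ptrace: true, Setpgid: true, Foreground: false}
process.Stdout = os.Stdout
if err := process.Start(); err != nil {
	panic(err)
}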
The breakpoint can be set at `0x498223` by replacing the original byte with the `int 3` opcode (`0xCC`). This can be done with `PtracePokeData`.
func setBreakpoint(pid int, addr uintptr) []byte {
	/* save the original byte so that it can be restored later */
	data := make([]byte, 1)
	if _, err := unix.PtracePeekData(pid, addr, data); err != nil {
		panic(err)
	}
	/* overwrite it with the int 3 opcode */
	if _, err := unix.PtracePokeData(pid, addr, []byte{0xCC}); err != nil {
		panic(err)
	}
	return data
}
You may be wondering why `PtracePeekData` is called before `PtracePokeData`. `PtracePeekData` reads the memory at the given address. I'll explain later why I'm saving the byte at `0x498223`.
Now that the breakpoint is set, we continue the program and wait for the trap. This can be done with `PtraceCont` and `Wait4`.
if err := unix.PtraceCont(pid, 0); err != nil {
	panic(err.Error())
}
/* wait for the interrupt to come. */
var status unix.WaitStatus
if _, err := unix.Wait4(pid, &status, 0, nil); err != nil {
	panic(err.Error())
}
fmt.Println("breakpoint hit")
After the breakpoint hits, we need the program to continue as usual. Since we overwrote the byte at `0x498223`, the program can't run as usual. So we need to replace the `0xCC` byte with the original data. Remember, we captured the original byte at `0x498223` using `PtracePeekData` while setting the breakpoint. Let's write it back.
if _, err := unix.PtracePokeData(pid, addr, data); err != nil {
	panic(err.Error())
}
Just restoring the original byte isn't enough to make the program run as usual, because by the time the breakpoint is reported the CPU has already executed the `0xCC` byte and the instruction pointer has moved past `0x498223`.
So, we want to tell the CPU to execute the instruction at `0x498223` again.
The CPU executes whatever instruction the instruction pointer points to. If you studied microprocessors at university, you might remember this.
So, if we set the instruction pointer back to `0x498223`, the CPU will execute the original instruction there. CPU registers can be read and written using `PtraceGetRegs` and `PtraceSetRegs`.
regs := &unix.PtraceRegs{}
if err := unix.PtraceGetRegs(pid, regs); err != nil {
	panic(err)
}
regs.Rip = uint64(addr)
if err := unix.PtraceSetRegs(pid, regs); err != nil {
	panic(err)
}
Now that the register is fixed, continuing the program would resume the normal flow. But we want to hit the breakpoint again on the next loop iteration, so we ask ptrace to execute only one instruction and then set the breakpoint again. `PtraceSingleStep` lets us execute exactly one instruction.
func resetBreakpoint(pid int, addr uintptr, originaldata []byte) {
	/* restore the original data */
	if _, err := unix.PtracePokeData(pid, addr, originaldata); err != nil {
		panic(err.Error())
	}
	/* set the instruction pointer to execute the instruction again */
	regs := &unix.PtraceRegs{}
	if err := unix.PtraceGetRegs(pid, regs); err != nil {
		panic(err)
	}
	regs.Rip = uint64(addr)
	if err := unix.PtraceSetRegs(pid, regs); err != nil {
		panic(err)
	}
	if err := unix.PtraceSingleStep(pid); err != nil {
		panic(err)
	}
	/* wait for its execution and set the breakpoint again */
	var status unix.WaitStatus
	if _, err := unix.Wait4(pid, &status, 0, nil); err != nil {
		panic(err.Error())
	}
	setBreakpoint(pid, addr)
}
So far we have learned how to manipulate registers and set breakpoints. Let's put all these into a for loop and drive the program.
pid := process.Process.Pid
data := setBreakpoint(pid, 0x498223)
for {
	if err := unix.PtraceCont(pid, 0); err != nil {
		panic(err.Error())
	}
	/* wait for the interrupt to come. */
	var status unix.WaitStatus
	if _, err := unix.Wait4(pid, &status, 0, nil); err != nil {
		panic(err.Error())
	}
	fmt.Println("breakpoint hit")
	/* reset the breakpoint */
	resetBreakpoint(pid, 0x498223, data)
}
Phew! Finally, we are able to print `breakpoint hit` before our sample program prints each random integer.
breakpoint hit
6129484611666145821
breakpoint hit
4037200794235010051
breakpoint hit
3916589616287113937
breakpoint hit
6334824724549167320
breakpoint hit
605394647632969758
breakpoint hit
1443635317331776148
breakpoint hit
894385949183117216
You can find the full source code at https://github.com/poonai/debugger-example
That's all for now. I hope you learned something new. In the next post, I'll write about how to extract the values of variables by reading DWARF info.
Plug
By the way, I've built a free VS Code extension that lets developers set logpoints and get logs from production systems straight into the VS Code console. You can check it out at quicklog.dev, or discuss it on our Discord server: https://discord.gg/suk99uC5fa