# Data structure for text editor

36,404

## Solution 1

On good old ZX-Spectrum one (or more, I do not know) text exditor used very simple structure.

There was one big buffer, which occupied all free RAM. Text was split in two parts at the cursor. Part before the cursor, was placed at the beginning of the buffer, and the rest at the end of the buffer. As text typed, data simply added to the end of first part, and when cursor is moved, text is copied forth and back.

Buffer layout:

``````Hello, World!
^------Cursor here

+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|H|e|l|l|o|,| |W| <free>  |o|r|l|d|!|
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                ^         ^        |
begin           cur1      cur2    end
``````

That's, how some edit operations was made:

``````Type a char:    buffer[cur1++] = character

Backspace:      cur1--

Cursor left:    buffer[--cur2] = buffer[--cur1]

Cursor right:   buffer[cur1++] = buffer[cur2++]
``````

Buffer in action:

``````             Hello, W..............orld!
Press right          ^             ^
Hello, Wo..............rld!
Press backspace       ^             ^
Hello, W...............rld!
Press 0              ^              ^
Hello, W0..............rld!
^             ^
``````

## Solution 2

Rope

A rope is essentially a binary tree whose leaves are arrays of characters. A node in the tree has a left child and a right child - the left child is the first part of the string, while the right child is the final part of the string. Concatenation of two ropes simply involves the creation of a new tree node with both ropes as children. To ensure logarithmic time indexing and sub-string operations the resulting rope may need to be balanced. Various balancing strategies are possible.

The main advantages of ropes as compared to storing strings as character arrays is that they enable much faster concatenation than ordinary strings, and don't require a large contiguous memory space to store a large string. The main disadvantages are greater overall space usage and slower indexing, both of which become more severe as the tree structure becomes larger and deeper. However, many practical applications of indexing involve only iteration over the string, which remains fast as long as the leaf nodes are large enough to benefit from cache effects.

## Solution 3

I know it's too late for an answer, but I found The Craft of Text Editing book really useful. It contains description of several buffer models with their pros and cons. Unfortunately, it doesn't mention Ropes data structure.

## Solution 4

You might find this interesting, even if it does not exactly answer your question:

Most efficient data structure to add styles to text

I am hoping that the discussion will go to fascinating places :-)

## Solution 5

As @Vovanium already mentioned the basic theory of how gap buffer can be used, I have implemented a version of C/C++.

Code:

``````#include <stdio.h>
#define SZ 1000

char leftArray[SZ], rightArray[SZ];
int leftCount, rightCount;
int cursorPos;

/*
* Private APIs
*/

void printArray(){

for(register int i = 0; i < leftCount; i++){
printf("%c", leftArray[i]);
}

for(register int i = rightCount - 1; i >= 0; i--){
printf("%c", rightArray[i]);
}
printf("\n");
}

while(leftCount > pos){
rightArray[rightCount++] = leftArray[--leftCount];
}

while(leftCount < pos){
leftArray[leftCount++] = rightArray[--rightCount];
}
}

/*
* Public APIs for Text Editor
*/

void init(){

cursorPos = leftCount = rightCount = 0;
}

for(register int i = 0; i < len; i++){
leftArray[leftCount++] = word[i];
}
leftArray[leftCount] = 0;
}

leftCount--;
}

void moveCurson(int newPosition){

cursorPos = newPosition;
}

void subString(int pos, int length, char result[]){

int k = 0;
int l = 0;
while(k + pos < leftCount && k < length){
result[k] = leftArray[pos + k];
k++;
}

length -= k;
while( l < length){
result[k++] = rightArray[rightCount - 1 - l];
l++;
}
}
``````
Share:
36,404
Author by

### Michael

Scala (moslty) programmer

Updated on March 04, 2020

• Michael almost 3 years

This is an interview question. What data structure would you use to store the text in a text editor?

+1 for telling me what the structure I re-invented (sic), described and suggested in one of my posts is called officially :-)
• Tony Delroy about 12 years
@thkala: pays to do some research first - nothing new under the sun ;-)
• Tony Delroy about 12 years
As well as concatenation, Rope's are typically relatively good (compared to entirely contiguous storage) for many deletion, insertion and substitution actions, especially near the start of the document or where growing the contiguous storage would require a move in memory.