How can I transform rows into repeated column based data?

12,844

Solution 1

Just for fun:

# Input file format is tab separated values

# name  search_term address code
# Jim jim jim_address 123
# Bob bob bob_address 124
# Lisa  lisa  lisa_address  126
# Mona  mona  mona_address  129


infile = File.open("inputfile.tsv")

headers = infile.readline.strip.split("\t")
puts headers.inspect
of = File.new("outputfile.tsv","w")
infile.each_line do |line|
  row = line.split("\t")
  headers.each_with_index do |key, index|
    of.puts "#{key}\t#{row[index]}"
  end
end

of.close



# A nicer way, on my machine it does 1.6M rows in about 17 sec

File.open("inputfile.tsv") do | in_file |
  headers = in_file.readline.strip.split("\t")
  File.open("outputfile.tsv","w") do | out_file |
    in_file.each_line do | line |
      row = line.split("\t")
      headers.each_with_index do | key, index | 
        out_file << key << "\t" << row[index]
      end
    end 
  end
end

Solution 2

You could add an ID column to the left of your data and use a Reverse PivotTable method.

  • Press Alt+D+P to access the Pivottable Wizard with the steps:

    1.  Multiple Consolidation Ranges
    2a. I will create the page fields
    2b. Range: eg. sheet1!A1:A4 
        How Many Page Fields: 0
    3.  Existing Worksheet: H1
    
  • In the PivotTable:

    Uncheck Row and Column from the Field List
    Double-Click the Grand Total as shown
    

enter image description here

Share:
12,844
OpenCoderX
Author by

OpenCoderX

Ruby, Ruby on Rails, Python, JavaScript, JQuery, HTML. Learning Java.

Updated on June 27, 2022

Comments

  • OpenCoderX
    OpenCoderX about 2 years

    I'm trying to take a dataset that looks like this:

    Source format of data

    And transform the records into this format:

    Destination Format

    The resulting format would have two columns, one for the old column names and one column for the values. If there are 10,000 rows then there should be 10,000 groups of data in the new format.

    I'm open to all different methods, excel formulas, sql (mysql), or straight ruby code would work for me also. What is the best way to tackle this problem?

    • mellamokb
      mellamokb almost 12 years
      Display format is generally an application-level concern and should be handled by your application code (ruby).
    • barancw
      barancw almost 12 years
      What is the point of getting this data into a new format? Is it only for human readability or is it to import this into another system?
    • LanguagesNamedAfterCofee
      LanguagesNamedAfterCofee almost 12 years
      Take a look at this Railscast railscasts.com/episodes/362-exporting-csv-and-excel You're not using Rails, but it will still help.
    • OpenCoderX
      OpenCoderX almost 12 years
      The point is to import it into another system. It is a hard requirement, they have built a lot of code around this format.
    • mellamokb
      mellamokb almost 12 years
      @opensourcechris: Sorry, I'm not familiar with Ruby :(