Copy only specific text of a file to another
Solution 1
I assume every line of the file follows the same pattern. If that is the case, you can use a command like the one below.
grep -o ' path=.*$' file.txt | cut -c8- |rev | cut -c 4- | rev
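So, grep reads the file directly (no cat is needed) and, with -o, prints only the text from ' path=' to the end of each line. Then I remove the unwanted leading characters using cut -c8-, which drops the first seven characters (the space and path="). Finally I use the rev technique (reverse the line, cut from the fourth character, reverse it back) to remove the three unwanted characters "/> from the end.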
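The pipeline can be traced one stage at a time on a single sample line. This is only a minimal sketch: the variable names line, step1, step2 and result are illustrative and not part of the command above.

```shell
# One sample line in the same format as the input file.
line='<classpathentry kind="con" path="WOFramework/ERJars"/>'

# Stage 1: keep only the text from ' path=' to the end of the line.
step1=$(printf '%s\n' "$line" | grep -o ' path=.*$')

# Stage 2: drop the first 7 characters (space, p, a, t, h, =, ").
step2=$(printf '%s\n' "$step1" | cut -c8-)

# Stage 3: reverse, drop the first 3 characters (the reversed "/>), reverse back.
result=$(printf '%s\n' "$step2" | rev | cut -c 4- | rev)

printf '%s\n' "$step1" "$step2" "$result"
```

The last line printed is the bare path, WOFramework/ERJars.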
Another awk approach
awk -F'path="' '{print $2}' file.txt |rev | cut -c 4- | rev
I use path=" as the field delimiter and print everything after it. The rev step does the same job as above.
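If rev is not available, the trailing characters could probably be trimmed inside awk itself. The following is a sketch of that variation, not the command given above; it assumes the path value never contains a double quote.

```shell
line='<classpathentry kind="con" path="WOFramework/ERJars"/>'

# -F'path="' splits the line in two: $1 is the text before path=", $2 the text after it.
# sub() then deletes the closing quote and everything following it from $2.
result=$(printf '%s\n' "$line" | awk -F'path="' '{ sub(/".*$/, "", $2); print $2 }')

printf '%s\n' "$result"
```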
Testing
cat file.txt
<classpathentry kind="src" path="Sources"/>
<classpathentry kind="con" path="WOFramework/ERExtensions"/>
<classpathentry kind="con" path="WOFramework/ERJars"/>
<classpathentry kind="con" path="WOFramework/ERPrototypes"/>
<classpathentry kind="con" path="WOFramework/JavaEOAccess"/>
<classpathentry kind="con" path="WOFramework/JavaEOControl"/>
<classpathentry kind="con" path="WOFramework/JavaFoundation"/>
<classpathentry kind="con" path="WOFramework/JavaJDBCAdaptor"/>
After running the command,
Sources
WOFramework/ERExtensions
WOFramework/ERJars
WOFramework/ERPrototypes
WOFramework/JavaEOAccess
WOFramework/JavaEOControl
WOFramework/JavaFoundation
WOFramework/JavaJDBCAdaptor
A better approach as provided by Stephane in comments.
cut -d '"' -f4 file.txt
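To see why field 4 is the right one, it helps to look at how a line splits on the double quote. The sketch below uses one sample line; the variable names kind and path are only illustrative.

```shell
line='<classpathentry kind="con" path="WOFramework/ERJars"/>'

# Splitting on " gives five fields:
#   1: <classpathentry kind=
#   2: con
#   3:  path=
#   4: WOFramework/ERJars
#   5: />
kind=$(printf '%s\n' "$line" | cut -d '"' -f2)
path=$(printf '%s\n' "$line" | cut -d '"' -f4)

printf '%s %s\n' "$kind" "$path"
```

Since cut does no filtering, the path of the src entry (Sources) is printed along with the others, as in the test output above.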
Solution 2
A simple approach with awk:
awk -F\" '/WOF/ {print $4}' abc.txt > outfile
-F\" changes the field separator from the default (a space) to a quote mark (escaped with \).
/WOF/ restricts the returned results to those records (lines of the file) that match the pattern WOF.
$4 is the fourth field of each of those matching records, the path.
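With the same field separator, the kind attribute ends up in $2, so it should also be possible to select records by kind instead of by the text WOF. This is a sketch of that variation under the assumption that every line has the attributes in the same order; it is not the command from the answer.

```shell
# Two sample lines from the question, one of each kind.
result=$(printf '%s\n' \
    '<classpathentry kind="src" path="Sources"/>' \
    '<classpathentry kind="con" path="WOFramework/ERExtensions"/>' |
    awk -F\" '$2 == "con" { print $4 }')

printf '%s\n' "$result"
```

Comparing $2 is stricter than /WOF/, which matches WOF anywhere on the line.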
Solution 3
sed -n '/.*="con"[^"]*./{s///;s/..>//p}' <<\DATA
<classpathentry kind="src" path="Sources"/>
<classpathentry kind="con" path="WOFramework/ERExtensions"/>
<classpathentry kind="con" path="WOFramework/ERJars"/>
<classpathentry kind="con" path="WOFramework/ERPrototypes"/>
<classpathentry kind="con" path="WOFramework/JavaEOAccess"/>
<classpathentry kind="con" path="WOFramework/JavaEOControl"/>
<classpathentry kind="con" path="WOFramework/JavaFoundation"/>
<classpathentry kind="con" path="WOFramework/JavaJDBCAdaptor"/>
DATA
OUTPUT
WOFramework/ERExtensions
WOFramework/ERJars
WOFramework/ERPrototypes
WOFramework/JavaEOAccess
WOFramework/JavaEOControl
WOFramework/JavaFoundation
WOFramework/JavaJDBCAdaptor
This should get only the WO... stuff, I think. It's also fully portable.
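For readers who find the one-liner terse, here is the same script spread over several lines and run on two sample lines. The layout and the variable name result are mine; the sed commands themselves are unchanged.

```shell
# Address /.*="con"[^"]*./ selects lines with kind="con"; the regex matches
# everything from the start of the line through the opening quote of the path.
# s/// has an empty pattern, so sed reuses that last regex and deletes the match.
# s/..>//p deletes the trailing "/> and prints the result (-n suppresses the rest).
result=$(printf '%s\n' \
    '<classpathentry kind="src" path="Sources"/>' \
    '<classpathentry kind="con" path="WOFramework/ERJars"/>' |
    sed -n '/.*="con"[^"]*./{
        s///
        s/..>//p
    }')

printf '%s\n' "$result"
```

Only the con line produces output; the src line never matches the address, so nothing is printed for it.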
Solution 4
Another approach with grep and cut:
grep "kind=\"con\"" sample.txt | cut -d \" -f 4 > sample_edited.txt
This will grep all lines containing kind="con" and print the paths by setting cut's delimiter to ".
Solution 5
Another solution if your version of grep supports PCRE-style lookarounds:
grep -oP '(?<=kind="con" path=").+?(?="/>)' abc.txt
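The pattern relies on a lookbehind and a lookahead, so only the text between them is printed. The sketch below runs it on one sample line and first checks whether the local grep accepts -P; the sed fallback is my own assumption about what one might use otherwise, not part of the answer.

```shell
line='<classpathentry kind="con" path="WOFramework/ERJars"/>'

# (?<=...) requires the prefix before the match without including it,
# .+? matches as few characters as possible,
# (?="/>) requires the suffix after the match without including it.
if printf 'x\n' | grep -qP 'x' 2>/dev/null; then
    result=$(printf '%s\n' "$line" | grep -oP '(?<=kind="con" path=").+?(?="/>)')
else
    # Assumed fallback for a grep without -P: capture the path with a sed group.
    result=$(printf '%s\n' "$line" | sed -n 's/.*kind="con" path="\([^"]*\)"\/>.*/\1/p')
fi

printf '%s\n' "$result"
```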
gkmohit
Updated on September 18, 2022
Comments
-
gkmohit almost 2 years
I have a file abc.txt the contents are:
<classpathentry kind="src" path="Sources"/>
<classpathentry kind="con" path="WOFramework/ERExtensions"/>
<classpathentry kind="con" path="WOFramework/ERJars"/>
<classpathentry kind="con" path="WOFramework/ERPrototypes"/>
<classpathentry kind="con" path="WOFramework/JavaEOAccess"/>
<classpathentry kind="con" path="WOFramework/JavaEOControl"/>
<classpathentry kind="con" path="WOFramework/JavaFoundation"/>
<classpathentry kind="con" path="WOFramework/JavaJDBCAdaptor"/>
I want to copy all the paths into another file. That is, I want my output text file to look like:
WOFramework/ERExtensions
WOFramework/ERJars
WOFramework/ERPrototypes
WOFramework/JavaEOAccess
WOFramework/JavaEOControl
WOFramework/JavaFoundation
WOFramework/JavaJDBCAdaptor
-
Remon about 10 years
You want to copy depending on kind?
-
Mikel about 10 years
Looks like you're trying to extract parts of an XML document. Try an XML tool such as xmlstarlet or xmllint. stackoverflow.com/questions/91791/…
-
Stéphane Chazelas about 10 years
cut -d '"' -f4 ?
-
Ramesh about 10 years
@StephaneChazelas, your answer should be the best solution :)
-
-
mikeserv about 10 years
He doesn't want "Sources"
-
mikeserv about 10 years
Actually, I guess he accepted it - so what do I know?
-
text about 10 years
sed -e 's/.*path="//' -e 's:".*$::' abc.txt > output_file -- dropping everything after the last quote instead of specific matching at the end.
-
Avinash Raj about 10 years
It displays extra lines.
-
Mathias Begert about 10 years
@AvinashRaj, where are you seeing extra lines in the OP's input data? The answer above is tailored to the OP's data.
-
Avinash Raj about 10 years
It also displays Sources, according to the OP's input.