Software for translating website content
Thread poster: trans-agrar
trans-agrar
trans-agrar  Identity Verified
Germany
Local time: 23:58
English to German
+ ...
Jan 11

Hello folks,
a new client asked me to quote for the translation of their website. They don't have the content offline, so either I translate directly in the backend (too time-consuming) or they somehow extract the texts from the website.

Using a crawler does not seem to be an elegant method.
Saving page by page as HTML and then translating in a CAT tool seems feasible.

Yet aren't there other ways of dealing with jobs like these? Wasn't Passolo made for this? Has anybody out there been using it? Or does anybody have another idea for dealing with CMS content that is not available offline?

Looking forward to smart ideas
Many thanks
Barbara
 
Zea_Mays
Zea_Mays  Identity Verified
Italy
Local time: 23:58
Member (2009)
English to German
+ ...
TMS Jan 11

If someone had uploaded the pages to the web space provider's servers, somewhere there are also the original files.
For a large website I would recommend using a TMS (translation management system) like Smartling or Phrase, as you also need to localize the UI, not just the content.
https://phrase.com/blog/posts/9-steps-to-get-your-website-localization-started/
https://www.smartling.com/resources/101/website-translation-services/.


 
Samuel Murray
Samuel Murray  Identity Verified
Netherlands
Local time: 23:58
Member (2006)
English to Afrikaans
+ ...
@Barbara Jan 11

trans-agrar wrote:
A new client asked me for quoting the translation of his website. They don't have the content offline...

A-ha, so this means that if you do generate HTML versions of their pages (and translate them), they won't be able to do anything with the HTML versions.

The HTML version that appears in your browser is an exported version that their web server generated on the fly, and it can't be imported back into their system. In the old days of the web, web servers held actual HTML pages that they simply sent to the browser, so it was possible to translate a website by "downloading" all the files in HTML format, translating them, and then having the client upload them again.

This means that you have two options: do the translation in e.g. Word, and then paste it into their back-end manually, or you hope that they can somehow export it in a format that they can import back AND which you are able to translate in a CAT tool. If their web developer does promise that they can import a file back in, do a couple of short tests first, to see if the web developer truly understands what you're asking of him and what he's promising (and to see if you are able to open their files in your CAT tool).

The Word method works by visiting the page in your web browser, then selecting the relevant text and copy/paste it into Word. Then do the translation in your favourite CAT tool. Then, convert your Word file into a plain text TXT format, and open it in Notepad (or: copy/paste from Word into a blank Notepad document, and then save the Notepad file as a TXT file). Then, go to the back-end, and open the back-end's editor. Then, copy the textual content from Notepad, and paste it into the back-end, and then manually apply the correct formatting. DO NOT simply copy content directly from a Word file into the back-end (even if you're using a keyboard shortcut that supposedly pastes plain text), because you'll inadvertently copy hidden formatting codes that will be a headache to fix afterwards.

Zea_Mays wrote:
If someone had uploaded the pages to the web space provider's servers, somewhere there are also the original files.

True, but this only applies to websites that do not use a CMS.

[Edited at 2024-01-11 09:07 GMT]


 
Zea_Mays
Zea_Mays  Identity Verified
Italy
Local time: 23:58
Member (2009)
English to German
+ ...
@Samuel Jan 11

Sure, CMS (or online website creation tools) count. Especially in the latter case you could use the same system creating copies of the existing pages. The main issue remains the User Interface.
You could also copy the html source text (press CMD+U or similar key combination) of the single pages, if you are able to use it.
But why all that hassle if the best method is using a TMS?



[Edited at 2024-01-11 09:20 GMT]


 
trans-agrar
trans-agrar  Identity Verified
Germany
Local time: 23:58
English to German
+ ...
TOPIC STARTER
TMS versus Cat Tool Jan 11

Thanks Samuel and Zea


Zea_Mays wrote:

Sure, CMS (or online website creation tools) count. Especially in the latter case you could use the same system creating copies of the existing pages. The main issue remains the User Interface.
You could also copy the html source text (press CMD+U or similar key combination) of the single pages, if you are able to use it.



[Edited at 2024-01-11 09:20 GMT]


Do you mean "press Control + U"? This gives me all the formatting information of the page not just the source text. Am I doing something wrong?


I also checked out Phrase. So, is this an offline app? How does it access the content that needs translating? Sorry if I become a pain in the neck.

Thanks
Barbara


 
Samuel Murray
Samuel Murray  Identity Verified
Netherlands
Local time: 23:58
Member (2006)
English to Afrikaans
+ ...
@Zea Jan 11

Zea_Mays wrote:
But why all that hassle if the best method is using a TMS?

Well, to use a TMS, you need to be able to either supply offline files to the TMS, or you need to be able to connect the website to the TMS.

Connecting the website to the TMS is something that the client's web developer may or may not be able to do, depending on what platform they're using, what add-ons are available, and how skilled they are at figuring out the documentation of the TMS.

But sure, if the OP is willing to use Smartling or Phrase (which is another question), then of course she can mention this to the client, although trying to connect a website to a TMS is likely going to cost money.


 
Samuel Murray
Samuel Murray  Identity Verified
Netherlands
Local time: 23:58
Member (2006)
English to Afrikaans
+ ...
@Barbara Jan 11

trans-agrar wrote:
Do you mean "press Control + U"?

Zea is probably on a Mac, where the Command (Cmd) key does the job that the Ctrl key does on Windows. In many browsers, the shortcut Ctrl+U will show you the web page's source code.

This gives me all the formatting information of the page not just the source text.

Yes, it gives you the "source code" (not "source text"), which is all the text as well as all the HTML codes that make up the formatting and functionality of the web page. My opinion is that this source code won't be useful to you.
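
To illustrate the difference between "source code" and the text inside it, here is a minimal Python sketch that pulls the visible text out of a page's HTML using only the standard library. The sample page and the names `VisibleTextExtractor` and `visible_text` are made up for illustration. It is not what a CAT tool's HTML filter does internally, just a rough way to see how much translatable text a page holds.

```python
from html.parser import HTMLParser


class VisibleTextExtractor(HTMLParser):
    """Collects text nodes and ignores the content of non-visible elements."""

    SKIPPED = {"script", "style", "noscript", "template"}

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.chunks = []
        self._skip_depth = 0  # > 0 while inside a skipped element

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        text = " ".join(data.split())  # collapse whitespace and line breaks
        if text and self._skip_depth == 0:
            self.chunks.append(text)


def visible_text(html_source):
    """Return the visible text fragments of an HTML document, in page order."""
    parser = VisibleTextExtractor()
    parser.feed(html_source)
    parser.close()
    return parser.chunks


# Hypothetical sample page, standing in for what Ctrl+U would show.
sample = """<html><head><title>Farm news</title>
<style>p { color: red; }</style></head>
<body><h1>Welcome</h1><p>Fresh <b>maize</b> daily.</p>
<script>var x = 1;</script></body></html>"""

for fragment in visible_text(sample):
    print(fragment)
```

Note that inline tags such as `<b>` split a sentence into several fragments here, which is one reason a CAT tool's own HTML filter is usually the better choice for actual translation.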

I also checked out Phrase. So, is this an offline app?

Phrase is an online tool, but I believe they also have an offline app that you can use to access the online content offline. (By "online" here we don't mean the content on your client's website, but the content on the Phrase website.)

The client's web developer would have to connect his website to Phrase somehow, and then select the precise text that should be translated, so that you only see the text that the client wants you to translate, and then Phrase will show the sentences in its own CAT tool. Then, if you use the offline app, you would be able to translate it offline, and then upload or synchronize with Phrase to get your translations back into the online version of Phrase... and from there, the client's web developer would have to import your translations back into his own website.

I don't personally think that Smartling or Phrase is an option for you -- my opinion is that it was designed for very large projects with large budgets and dedicated project managers. Others may disagree with my opinion.

[Edited at 2024-01-11 10:30 GMT]


 
Zea_Mays
Zea_Mays  Identity Verified
Italy
Local time: 23:58
Member (2009)
English to German
+ ...
How I would proceed Jan 11

1) Ask the client who looks after their website (generally, there is someone who publishes news and/or new articles, updates prices etc.).
2) When there is such a person, ask to get in touch with them.
3) Ask this person what system is used to manage the website and a) how many pages there are (this is very important) and b) if a TMS can be linked to that system.
Just to repeat: the difficulty is not localizing the content itself but implementing it in the new pages, and above all you also have to localize the UI. How do you plan to do this? How do you find all those little parts in the navigation, for example? And how will you implement the localized versions?
If you have excellent website development knowledge, you can do this with ease. If not, you need to work with the person who manages the website or with a website developer to be hired if necessary.
 
Samuel Murray
Samuel Murray  Identity Verified
Netherlands
Local time: 23:58
Member (2006)
English to Afrikaans
+ ...
@Barbara Jan 11

Zea_Mays wrote:
Just to repeat: the difficulty is not localizing the content itself but implementing it in the new pages, and above all you also have to localize the UI. How do you plan to do this? How do you find all those little parts in the navigation, for example? And how will you implement the localized versions?

Zea makes a very good point: your client may not be aware of this, but different parts of the website are often edited in entirely different places, and the UI (i.e. the menus at the top, the "Read more" buttons, the footer, various buttons etc.) is often part of different systems than the actual paragraph content of the pages.

Perhaps your client only wants you to translate the paragraph content in the middle of the page (and doesn't care that the surrounding text remains in the original language), and if so, then it's fine -- you don't need to worry about the UI.

But if the client wants the UI to be translated as well, then that is going to take a lot more work and skill from the web developer. You'd have to communicate with that person extensively to find out what they can or cannot do.

On the other hand, it is entirely possible that the client might be quite willing to paste your translation into the various places themself, and they might be happy if you just supply the translation in a two-column Word file (source text in the left column, your translation in the right column, segmented by paragraph). You'd have to speak to the client and/or their web developer to find out what they can do and would want you to do.
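
As a rough illustration of the two-column idea, the Python sketch below pairs source and target paragraphs and writes them as two columns. It writes CSV rather than a Word file (a swap made here only because CSV needs nothing beyond the standard library), and the sample texts and function names are invented for the example.

```python
import csv
import io
import re


def paragraphs(text):
    """Split text on blank lines and tidy the whitespace inside each paragraph."""
    parts = re.split(r"\n\s*\n", text.strip())
    return [" ".join(part.split()) for part in parts if part.strip()]


def two_column_rows(source_text, target_text):
    """Pair source and target paragraphs; refuse to guess if the counts differ."""
    source = paragraphs(source_text)
    target = paragraphs(target_text)
    if len(source) != len(target):
        raise ValueError(
            f"{len(source)} source paragraphs but {len(target)} target paragraphs"
        )
    return list(zip(source, target))


def write_two_columns(rows, handle):
    """Write a header plus one row per paragraph pair to an open text file."""
    writer = csv.writer(handle)
    writer.writerow(["Source", "Translation"])
    writer.writerows(rows)


# Hypothetical sample texts.
source_text = "Welcome to our farm.\n\nRead more"
target_text = "Willkommen auf unserem Hof.\n\nMehr lesen"

rows = two_column_rows(source_text, target_text)
buffer = io.StringIO()  # swap for open("pairs.csv", "w", newline="") to save a file
write_two_columns(rows, buffer)
print(buffer.getvalue())
```

The paragraph-count check is deliberate: if a paragraph was merged or split during translation, it is better to find out before the client starts pasting.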


 
Joakim Braun
Joakim Braun  Identity Verified
Sweden
Local time: 23:58
German to Swedish
+ ...
Copy-pasting Jan 11

Samuel Murray wrote:

(...) go to the back-end, and open the back-end's editor. Then, copy the textual content from Notepad, and paste it into the back-end, and then manually apply the correct formatting.


It's unlikely to be as simple as that. In a CMS the displayed data may be generated by any combination of entities, custom modules and relationships, and the source of a given text can be quite hard to figure out even for a developer. The data may also contain text generated by embedded links to the back-end (say, to factboxes or author info in a news article), which the HTML code won't tell you.


 
trans-agrar
trans-agrar  Identity Verified
Germany
Local time: 23:58
English to German
+ ...
TOPIC STARTER
Thanks Jan 12

Thank you all for taking the time and sharing your knowledge. I have in fact been talking to the web developer and they suggested two options to proceed:

1- read out all links and have a crawler run through the content. He says this is quite tricky as to a correct configuration and also involves the risk of producing double content and other issues.

2 - read out all links from the site map and copy paste the content into Word for translation and then copy the translated texts into the backend text boxes. I guess translating the UI elements should then be done online and directly in the backend.
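
For what it's worth, the "double content" risk of option 1 can often be reduced by comparing the extracted texts and keeping only one copy of each. The Python sketch below shows the idea under the assumption that the crawler output is already available as a mapping from URL to extracted text; the URLs and function names are invented, and it only catches exact duplicates after whitespace and case normalization.

```python
import hashlib


def normalize(text):
    """Collapse whitespace and ignore case so trivial differences don't matter."""
    return " ".join(text.split()).casefold()


def drop_duplicate_pages(pages):
    """Keep the first URL for each distinct text; pages maps URL -> extracted text."""
    seen_digests = set()
    unique = {}
    for url, text in pages.items():
        digest = hashlib.sha256(normalize(text).encode("utf-8")).hexdigest()
        if digest in seen_digests:
            continue  # same content already reached via another URL
        seen_digests.add(digest)
        unique[url] = text
    return unique


# Hypothetical crawler output: the same page reached through two URLs.
crawled = {
    "https://example.com/about": "About  us\nWe grow maize.",
    "https://example.com/about?ref=nav": "About us We grow maize.",
    "https://example.com/contact": "Contact",
}

unique_pages = drop_duplicate_pages(crawled)
print(list(unique_pages))
```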

After reading your inputs, I guess option 2 is the only reasonable option left. I fancied a more elegant solution and at the same time wanted to make sure I am offering a procedure that is state of the art. I feel reassured now that we should take option 2.

Thanks again and hello from frosty Heidelberg
Barbara


 
Samuel Murray
Samuel Murray  Identity Verified
Netherlands
Local time: 23:58
Member (2006)
English to Afrikaans
+ ...
@Barbara Jan 12

trans-agrar wrote:
2 - read out all links from the site map...

It's fantastic if they can supply you with a site map (presumably an XML one).
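
If the site map does turn out to be an XML one in the standard sitemaps.org format (an assumption until the developer confirms it), listing its page addresses takes only a few lines of Python. The sample site map and the function name below are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Namespace used by standard XML site maps (sitemaps.org protocol).
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def urls_from_sitemap(xml_text):
    """Return the page addresses listed in a <urlset> site map, in file order."""
    root = ET.fromstring(xml_text)
    return [
        loc.text.strip()
        for loc in root.findall("sm:url/sm:loc", SITEMAP_NS)
        if loc.text
    ]


# Hypothetical sample; a real file would come from the web developer.
sample_sitemap = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.com/</loc></url>
  <url><loc>https://example.com/products</loc></url>
</urlset>"""

for url in urls_from_sitemap(sample_sitemap):
    print(url)
```

The resulting list doubles as a checklist for the page count and for tracking which pages have already been copied out and pasted back.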

...and copy paste the content into Word for translation and then copy the translated texts into the backend text boxes.

I recommend that you do not copy/paste directly from Word, even if you do do the translation in Word. Word files contain added formatting codes that you don't want to be added to the website, but the WYSIWYG edit boxes on websites' backends tend to try to retain that extra formatting, and while you may not see anything wrong yourself, a different user using a different browser may end up seeing the text formatted in an odd kind of way.


 
content extraction Jan 12

Zea_Mays wrote:

1) Ask the client who looks after their website (generally, there is someone who publishes news and/or new articles, updates prices etc.).
2) When there is such a person, ask to get in touch with them.
3) Ask this person what system is used to manage the website and a) how many pages there are (this is very important) and b) if a TMS can be linked to that system.
Just to repeat: the difficulty is not localizing the content itself but implementing it in the new pages, and above all you also have to localize the UI. How do you plan to do this? How do you find all those little parts in the navigation, for example? And how will you implement the localized versions?
If you have excellent website development knowledge, you can do this with ease. If not, you need to work with the person who manages the website or with a website developer to be hired if necessary.


In my experience, contacting the website administrator makes it possible to discuss extraction methods (xlsx, xml, po, or others). If the site runs on a CMS, the system should have a database behind it, and an extraction can be planned from there. In short, it is a matter of patience and of talking to the right person; before jumping into overly time-consuming solutions, it is really worth applying some "relational" skills to find a workable solution.
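
To give an idea of what such an extraction might look like on the administrator's side, here is a minimal Python sketch that reads texts from a database and writes them as PO entries. Everything about the database is an assumption made for the example: real CMS schemas differ widely, and the table `page_text` with its columns `id`, `field` and `content` is purely hypothetical. An in-memory SQLite database stands in for the CMS database.

```python
import sqlite3


def po_escape(text):
    """Escape backslashes, double quotes and newlines for a PO string."""
    return text.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")


def export_po(connection):
    """Build PO text with one entry per row of the hypothetical page_text table."""
    # Minimal header entry declaring the character set.
    entries = ['msgid ""\nmsgstr ""\n"Content-Type: text/plain; charset=UTF-8\\n"\n']
    query = "SELECT id, field, content FROM page_text ORDER BY id, field"
    for row_id, field, content in connection.execute(query):
        context = f"page:{row_id}:{field}"  # tells the importer where the text belongs
        entries.append(
            f'msgctxt "{po_escape(context)}"\n'
            f'msgid "{po_escape(content)}"\n'
            'msgstr ""\n'
        )
    return "\n".join(entries)


# Hypothetical stand-in for the CMS database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE page_text (id INTEGER, field TEXT, content TEXT)")
conn.executemany(
    "INSERT INTO page_text VALUES (?, ?, ?)",
    [(1, "title", "Our farm"), (1, "body", 'We say "hello".')],
)

po = export_po(conn)
print(po)
```

The `msgctxt` line is what makes a later re-import possible, because it records which database row and field each translation belongs to.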

[Edited at 2024-01-12 09:57 GMT]


 
trans-agrar
trans-agrar  Identity Verified
Germany
Local time: 23:58
English to German
+ ...
TOPIC STARTER
Any experience with JSON exchange format? Feb 6

Hello,
still on this matter, we have agreed on the following procedure: as memoQ supports the JSON exchange format, the developer is now planning to provide the data for translation in this format. This will also provide the user interfaces (or, in developer lingo, 'the context'). But they say that will involve extra costs for the client who owns the website. So we are waiting again. Does anybody out there have experience with JSON?

Thanks
Barbara


 
Zea_Mays
Zea_Mays  Identity Verified
Italy
Local time: 23:58
Member (2009)
English to German
+ ...
rolleyes Feb 6

I do not envy you.

https://docs.memoq.com/current/en/Places/javascript-object-notation-json.html
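
For anyone who has not seen JSON before, the Python sketch below shows roughly what such an exchange file can look like and why it can be imported back: every text sits at a fixed position in the structure. The sample structure is invented (the developer's real export may look quite different), and the helper functions are illustration only, not how memoQ's JSON filter works. A sketch like this is mainly useful for the short round-trip test with the developer before the real job starts.

```python
import json


def extract_strings(node, path=""):
    """Flatten nested JSON into {path: text}, skipping numbers, booleans and nulls."""
    found = {}
    if isinstance(node, dict):
        for key, value in node.items():
            found.update(extract_strings(value, f"{path}/{key}"))
    elif isinstance(node, list):
        for index, value in enumerate(node):
            found.update(extract_strings(value, f"{path}/{index}"))
    elif isinstance(node, str):
        found[path] = node
    return found


def apply_translations(node, translations, path=""):
    """Return a copy of the structure with translated texts put back in place."""
    if isinstance(node, dict):
        return {
            key: apply_translations(value, translations, f"{path}/{key}")
            for key, value in node.items()
        }
    if isinstance(node, list):
        return [
            apply_translations(value, translations, f"{path}/{index}")
            for index, value in enumerate(node)
        ]
    if isinstance(node, str):
        return translations.get(path, node)  # untranslated texts stay as they are
    return node


# Hypothetical export with UI texts ("nav") and page content ("pages").
export = json.loads("""
{
  "nav": {"home": "Home", "contact": "Contact us"},
  "pages": [{"id": 7, "title": "Our farm", "body": "We grow maize."}]
}
""")

strings = extract_strings(export)
translated = apply_translations(export, {"/nav/home": "Startseite"})
print(json.dumps(translated, ensure_ascii=False, indent=2))
```

One limitation of this sketch: keys that themselves contain a slash would make the paths ambiguous, so a real tool would need a more careful addressing scheme.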


 

