Translation technology | Blog | sa国际传媒 /category/translation-technology/ Nordic translation specialists Tue, 29 Nov 2022 19:00:18 +0000 en-GB

From CAT tools to choir singing – Trixie’s 25 years at Sandberg /25-years-at-sandberg/ Tue, 10 Aug 2021 15:23:43 +0000

The post From CAT tools to choir singing – Trixie’s 25 years at Sandberg appeared first on sa国际传媒.

On 16 June 2021, Sandberg marked the 25th work anniversary of Trixie Lignel Hauberg, Lead Danish Translator and the company’s longest-serving employee. Interviewed by Anu Carnegie-Brown, Managing Director of Sandberg, Trixie gave us an insight into the many challenges, changes and joys of working at Sandberg over the years.

Paving the way to today

Choosing your career path can take time and a lot of thought, and that’s exactly what Trixie gave her decision.

Between finishing her college studies and deciding on a master’s course, Trixie spent four years making the most of her time with learning opportunities and gaining further experience. She completed a language course in Paris, worked with children and spent time travelling, among other things.

After those four years, Trixie enrolled on a Master’s in Specialised Language for Business. Upon completing her degree she was awarded the title of state-authorised translator. This paved the way for a career that would continuously challenge and reward her over the years.

In terms of work, Trixie found the perfect place for her in the translation and localisation industry. Her first job was with a software localisation company in Denmark, and upon hearing of an opening with Sandberg, she took the opportunity.

Twenty-five years later, Trixie is still here with us as our Lead Danish Translator and holds the proud title of longest-serving employee.

The 3 Ds

When asked how she would describe Sandberg in just three words, Trixie didn’t skip a beat: “dedicated, dynamic and diverse”.

Sandberg is dedicated, dynamic and diverse

Sandberg’s dedication is clear as day when you are always “surrounded by people with a lot of enthusiasm for languages.” Management also plays its part with a continuously positive and encouraging attitude.

Sandberg is dynamic in its ability to adapt to new situations, and Covid-19 has led to some of the biggest changes to working life. Long before this global pandemic, however, there were gradual but equally significant developments needed to keep pace with technological changes. Sandberg has always been quick to adapt to the latest tools, such as computer-assisted translation (CAT) tools and the use of machine translation (MT).

Finally, Trixie explains that Sandberg’s diversity stems from the great efforts made towards representing an ever-growing variety of countries and cultures within its workforce.

Call the alliteration coincidence if you wish, but we think this just goes to show the depth of Trixie’s natural skill with languages.

A day in the life of a translator

For those of you who may not know much about a translator’s typical working day, Trixie gave us the lowdown.

The process begins with the project manager, who is responsible for assigning linguistic tasks. The translator then checks the content and deadline to make sure they have enough time to deliver their best work before accepting the job. Once accepted, it’s time to get started.

Translators use various tools to facilitate the translation process. CAT tools give translators easy access to term bases and translation memories (TMs), which help to accelerate the translation process and ensure a consistent, high-quality translation.

Once translated, a series of automated checks is performed in these same CAT tools. The translation is then sent to a second linguist for revision. If the reviser is a fellow “in-houser”, terms and translation choices can be discussed together, which is great for developing linguistic skills and knowledge.

When the revision is complete, the job will either be sent back to the original translator for a final review or it will be delivered straight to the client.

What’s the problem?

I’ve had to swallow some camels – Trixie

Any job can throw up a problem to solve now and then. Interestingly, however, when it comes to translation work, Trixie explains that the hardest things she has had to deal with haven’t been work-related. Instead, the biggest challenge she has had to face concerns developments in the Danish language itself. To give an example, the Danish language is heavily influenced by English, what with the internet making the English language much more accessible to all. It can be difficult not only to implement the new English-esque grammatical structures, but to actually accept them as being the “new normal”.

Even though she has had to “swallow some camels”, as the Danes might say, Trixie was quick to reassure us that once you have mastered the various translation tools and systems used at Sandberg, the work itself is generally quite straightforward!

Changing with the times

In 25 years, Sandberg has been shaped and reshaped by all manner of changes. On the one hand there are those that are natural and gradual as the company develops; changes that are almost unnoticeable until you look back on things. But on the other hand, Trixie highlighted some changes that have been much more tangible.

Trixie shared her initial apprehensions concerning the game changer that is CAT tools. Like many other translators she feared that this technology would be detrimental to linguists. Would they suck the joy out of translating? Would they start to replace translators as these CAT tools became ever more sophisticated? Was the goal to remove the human from the translation equation entirely?

In the end, no. As it turns out, when it comes to CAT tools, it’s all in the name: they “assist”. As they became more integrated into the everyday workflows and translation processes, Trixie realised that CAT tools and MT can be very beneficial. These tools facilitate tasks with regard to consistency and speed of production, amongst other advantages.

Skilled linguists still have an important role to play

Even when MT can deliver a high-quality translation, it still needs human input and thorough editing before it can be delivered safely to a client. So, it is clear that skilled linguists still have an important and necessary role to play.

Hidden talents

One of the joys of working as a translator is that the projects are always varied and a new challenge is never far away. But just like with any job, you need some downtime.

One pastime Trixie enjoys outside of work is music: both singing and, on occasion, composing! And while Trixie is very humble when it comes to her musical talents, her skill as a composer and lyricist won her first place in a hymn writing competition!

If the past 25 years at Sandberg weren’t proof enough of Trixie’s loyalty and dedication, then her hobbies certainly drive the point home. Twenty-five years of hard work with the same company is a testament in itself. But, add 20 years of singing in the same choir and it’s clear that Trixie sticks with what she loves.

When things are bad…

…they’re not that bad for long! According to Trixie, the best source of motivation is when things simply fall into place: projects and technology run smoothly, and the client gets a high-quality product delivered right on time.

But on days when a silver lining is particularly hard to find, you can always count on your colleagues at Sandberg. There’s always someone on hand to crack a couple of jokes or send a few choice memes that hit the spot and put a smile back on your face.

After 25 years, it’s clear that it’s really the little things in life and work that make it all great.

So, there’s really just one final thing to say: from all of us here at Sandberg, thank you Trixie for your continuous hard work and dedication over these last 25 years! Here’s to many more projects, challenges and good times to come.

A beginner’s guide to multilingual SEO /beginners-guide-multilingual-seo/ Thu, 29 Aug 2019 10:00:00 +0000

We all know that web pages aren’t written for search engines – they’re meant for people. Findability is not an end in itself: it’s simply a means of bringing more readers through to your landing pages and blog posts. Yet we all know from our own experience of searching for things online that if something doesn’t appear on the first page of the search results, then it may as well not exist. Search engine optimisation (SEO) helps your pages rank highly in search results, giving them the visibility they need for potential readers to find them easily.

Visibility is only one aspect of boosting your online presence, however. Finding the right balance for content that ranks highly on search engines but is also engaging, visually appealing and business-focused is a team effort, involving SEO experts, digital marketers, content specialists and managers.

If your website’s only in one language, then your work is more or less done. But what if your website is in multiple languages for different markets, or if you’re targeting English speakers from different countries? If that’s the case, then you can add a localisation specialist to the list of professionals above.

In this article, we’ll look at how to localise your search engine strategy if you’re planning to expand your business, using examples from the English-speaking world and the Nordic region.

Regional variations within the same language

The amount of SEO effort you need to put into your website depends primarily on the volume of content you create, although it depends on your business strategy too. It’s one thing selling products or services for English-speaking people in the US, but a whole different beast if your target market includes the UK too.

Let’s take the example of fictional online clothing retailer “YourStyle”. It’s important to rank highly for the word pants in the US and trousers in the UK (the word pants in the UK typically refers to underwear). Like SEO, localisation is also about understanding the user’s intent – in this case purely from a language perspective.

To check whether you’ve made the right wording choice, it’s always useful to double-check on a search engine. A good way to do this is with a VPN, which allows you to simulate an internet connection in any country. The image below shows a search for pants on the US version of Google:

A search for “pants” on the US version of Google.

You can confirm trousers is indeed the word you need to target the British market just by looking at the results Google offers. In both examples, Google offers images of the right product and links to retailers that searchers can click through to make a purchase.

A search for “trousers” on the UK version of Google.

SEO across multiple languages

Let’s return to YourStyle, our fictional clothing retailer. They’re looking to expand to new markets after a successful launch in the UK. Market research finds that consumers in the Nordic countries have significant spending power and represent the next big opportunity for YourStyle.

A common strategic mistake when entering the Nordic market is to assume that English will do – after all, many Nordic people speak excellent English. The reality is that growing your business in this region requires you to build up an online presence in the local languages: Danish, Finnish, Icelandic, Norwegian and Swedish. If YourStyle really wants to penetrate this market, the old adage of speaking the language of your customers undoubtedly applies.

Here are some tips for properly localising your SEO strategy.

1. Think topics first, keywords second

There’s now broad consensus that individual keywords on your website aren’t what make you rank highly: it’s about producing the best content with a dynamic range of terms and contexts on a specific topic. As SEO shifts towards this topic-based model, it’s important to define your topics before defining your keywords.

In essence, the topic-cluster model – popularised by Hubspot – is a way of organising content on your website so that search engines know you’re an authority on that specific subject. The model is built around a central pillar page (which acts as a “content hub” for a single topic) and multiple content pages on the same topic that link back to the pillar page and to each other.

If you’ve already built your content with the topic-cluster methodology in your source language, then you’re off to a good start. The question then becomes one of which pages to localise. Many companies choose to translate only their pillar pages, which may seem like an obvious move as that’s where most conversions take place. But this doesn’t help your overall multilingual SEO strategy. The right approach is to translate all the pages in the cluster and replicate your source-language SEO efforts in your target language.

2. On-page keyword planning

Once you’ve got a clear understanding of which topics you want to become an authority on, you can start planning and analysing on-page keywords. If you’ve done your research properly, you will have chosen your keywords in your source language using the following criteria:

Search volume: Search volume is a measure of the total number of searches made through a search engine, expressed as the average monthly volume over the previous 12-month period. Search volume data is a fundamental element of your SEO strategy. If you’re localising your website from English into the Nordic languages, you can expect search volume to be lower: the combined population of the Nordic countries is around 24 million. Conversely, if you’re localising in the other direction (into English), search volume can increase considerably. But search volume alone says little if you don’t use it together with other metrics such as keyword difficulty.

Locale                Keyword                   Search volume   Competition
English (US & UK)     e-commerce platform       1,300           High
Swedish (Sweden)      e-handelsplattformen      390             High
Norwegian (Norway)    netthandelsplattformen    10              Medium
Finnish (Finland)     verkkokauppa-alusta       260             High
Danish (Denmark)      e-handelsplatform         10              Medium
Icelandic (Iceland)   netverslunarkerfi         10              Low

Source: Google Keyword Planner

Keyword difficulty: Determining the difficulty of a specific keyword requires understanding the level of competition. What pages are currently at the top of the rankings? What kind of content are they offering? Can you offer something better? SEO experts usually take a look at SERP (search engine results page) history to find out whether competition at the top is too tight or whether there is a window of opportunity.

It might seem logical that if you can’t offer better content than those top pages, then that keyword is probably too difficult for you. That might be the case for English, but not for your other locales. The top results in your different locales are unlikely to feature the same sites as they do in English, and there’s naturally a great deal more competition for English keywords. Analysing the SERP history for translations of your keywords can open up opportunities to shoot to the top of the rankings more quickly.

Search intent: Your webpage might be perfectly optimised and still not rank as highly as you want it to. Sometimes it’s not just about getting the core parts of SEO right, but also about understanding users’ search intent, i.e. what exactly it is they’re trying to achieve with their search.

What does this mean in practice? Google is now capable of evaluating your website’s user experience and uses this data to tweak its rankings. Let’s imagine you have two pages – both perfectly optimised – but one offers a free trial while the other doesn’t. It’s highly likely that the page with a free trial will rank more highly over time. SEMrush addresses this topic in detail.

It’s vital to remember that search intent and user experience can be as culturally relative as language. In some countries, it’s not realistic to expect every customer to enter their credit card details to start a free trial. In the Nordic countries, a good strategy is to not overburden your pages with triggers and banners: a minimalistic approach that embraces negative space is likely to be more effective.

3. Text expansion and contraction

There are some technical aspects to search engine optimisation. One example is the character limit for SEO attributes like title tags or meta descriptions: Google typically displays the first 50–60 characters of a title tag and 155–160 for the meta description.

Even if you’ve done an excellent job with your keywords in English, one of the issues you need to consider when translating the text from English into other languages is text expansion and contraction. These are two common concepts in translation and refer to the phenomenon of text getting longer or shorter when translated into a different language. When text is translated from English into Nordic languages, it usually becomes a bit longer and could potentially exceed the recommended limits for these SEO attributes.

Below, we show how a title tag and a meta description can expand and contract. These are our examples in English:

Title tag: The best e-commerce platform for small businesses (49 characters)

Meta description: We’re not just an e-commerce app – we’re the best e-commerce platform that has everything you need to sell online, on social media or face-to-face. (147 characters)

And here is the example translated into Swedish and Finnish:

Translations, with the character difference from the English in brackets:

Swedish
Title tag: Den bästa e-handelsplattformen för småföretag (-4)
Meta description: Vi är inte bara en app för e-handel – vi är den bästa e-handelsplattformen med allt du behöver för försäljning online, på sociala medier eller direkt till kunden. (+15)

Finnish
Title tag: Paras verkkokauppa-alusta pienimuotoiseen liiketoimintaan (+8)
Meta description: Enemmän kuin verkkokauppasovellus – paras verkkokauppa-alusta, joka tarjoaa kaiken, mitä tarvitset myyntiisi verkossa, sosiaalisessa mediassa tai kasvotusten. (+11)

Here the Swedish title tag actually used slightly fewer characters than the English, whereas the Swedish meta description and both Finnish translations needed more.
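A length check of this kind is easy to automate. The following is a minimal Python sketch, not a definitive tool: the limits of 60 and 160 characters are simply the upper ends of the ranges quoted above, and the names `expansion` and `over_limit` are hypothetical helpers chosen for illustration. It uses the title tags from the example.

```python
# Assumed limits: the upper ends of the display ranges quoted in the article.
TITLE_LIMIT = 60
META_LIMIT = 160


def expansion(source: str, target: str) -> int:
    """Return how many characters longer (or shorter) the translation is."""
    return len(target) - len(source)


def over_limit(text: str, limit: int) -> int:
    """Return the number of characters by which text exceeds limit (0 if it fits)."""
    return max(0, len(text) - limit)


# Title tags taken from the example above.
titles = {
    "en": "The best e-commerce platform for small businesses",
    "sv": "Den bästa e-handelsplattformen för småföretag",
    "fi": "Paras verkkokauppa-alusta pienimuotoiseen liiketoimintaan",
}

for locale, title in titles.items():
    print(
        locale,
        len(title),
        f"{expansion(titles['en'], title):+d}",
        "over by", over_limit(title, TITLE_LIMIT),
    )
```

The same two functions can be applied to meta descriptions with `META_LIMIT`. Note that `len` counts Unicode code points, so it only matches what a reader would count if accented letters are stored as single precomposed characters.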


Key points

Generally speaking, the key to a successful multilingual SEO strategy boils down to two things: content volume and language diversity.

  • Ensure your SEO architecture in your source language is well structured before starting to localise it. Review Hubspot’s topic cluster theory to check whether you’re on the right track.
  • When picking content for translation, try to choose pillar pages and pillar content from the same clusters to increase your chances of ranking highly in your target languages.
  • Remember that keyword difficulty varies by language, and with it your chances of ranking at the very top of the search results.
  • Optimise traditional SEO elements and localise your keywords with search volume and character length in mind.
  • Partner with localisation specialists to make sure your keywords are translated with an understanding of how search intent works in the target culture.

The right way to use machine translation /the-right-way-to-use-machine-translation/ Mon, 05 Aug 2019 10:28:38 +0000

The conundrum of what constitutes translation as opposed to post-editing of machine translation is one that has beset the language services industry for a few years now.

Ever since machine translation started to become the norm – in both academic and commercial contexts – users of machine translation have been asking themselves whether or not they’re doing something different mentally and practically when post-editing. This has also led researchers to ponder whether the language that’s produced from post-editing is actually a new one or simply a different type of translation.

What the research suggests

One such researcher is Antonio Toral, from the University of Groningen in the Netherlands, who has recently published a paper on the subject. In it, he explains how he compared a number of post-edited texts with human translations of the same texts using simplification, homogenisation and interference as his main assessment criteria.

He found that post-edited texts tend to have lower lexical variety and lower lexical density, with sentence lengths matching the source text a lot more closely. These tendencies produce texts that are generally less varied and less rich. They also tend to be more homogeneous and introduce significant interference from the source language. The sample size, metrics used and text types selected for this study have their limits, but it’s still quite interesting to observe this phenomenon.
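To give a feel for what measures such as lexical variety and sentence length look like in practice, here is a rough Python sketch. It is an illustration only and does not reproduce the metrics used in the paper: it assumes lexical variety can be approximated by a simple type-token ratio (distinct words divided by total words), and the function names and sample sentences are made up for the example.

```python
import re


def tokenize(text: str) -> list[str]:
    """Split text into lower-cased word tokens (a deliberately naive tokenizer)."""
    return re.findall(r"\w+", text.lower())


def type_token_ratio(text: str) -> float:
    """Distinct words divided by total words; a higher value means more variety."""
    tokens = tokenize(text)
    if not tokens:
        return 0.0
    return len(set(tokens)) / len(tokens)


def mean_sentence_length(text: str) -> float:
    """Average number of word tokens per sentence, splitting on ., ! and ?."""
    sentences = [s for s in re.split(r"[.!?]+", text) if tokenize(s)]
    if not sentences:
        return 0.0
    return sum(len(tokenize(s)) for s in sentences) / len(sentences)


# Hypothetical sample texts, invented for this sketch.
human = "The device starts quickly. Setup takes only a moment."
post_edited = "The device starts fast. The device setup is fast."

for label, text in (("human", human), ("post-edited", post_edited)):
    print(label, round(type_token_ratio(text), 2), mean_sentence_length(text))
```

Type-token ratio is sensitive to text length, so it is only meaningful when the texts being compared are of similar size.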

The effects on the wider language

Language change is a natural process: it happens and it has happened regardless of machine translation, with factors like instant messaging, social media and a constant need for better, faster and more optimised communication being key drivers of this trend. Now we can add post-editing to the list of factors that influence how the language we read evolves and develops.

That said, an important note to make is that while things like instant messaging are entirely driven by human input, machine translation (and by extension, post-editing) is not that different. Yes, the text is produced by a machine, but the machine itself is trained on data that is generated by humans, so in a sense, the machine is simply replicating what we all write and say to suit a specific context. So while it is an innovation, it’s also solidly grounded in data collected the old-fashioned way.

The machine is simply replicating what we all write and say to suit a specific context.

One could then argue that machine translation is innovative in the way it recycles the language to reuse it when possible – a very “green” approach of not wasting any training data it has been fed. Data is, after all, the new high-value commodity in our modern world, and language data is incredibly important for any translation provider thinking about using machine translation to its fullest potential.

A different skillset for translators

At the end of the day, it’s easy to think – regardless of whether you’re post-editing or translating – that you’re just turning one language into another, right? While that might be true, the way you approach the task is substantially different on a fundamental level: when you post-edit, there’s already something there, so you’re not starting with a blank canvas.

This may sound obvious, but it leads to a number of interesting habits for post-editors, one of which is the temptation to simply read the MT output and think “yeah, that’ll do, next”, especially if you’re pressed for time with a deadline looming. False translations, unidiomatic constructions and internal inconsistencies are among the most common examples of “under-editing”, so it’s important to always be careful and rely on good old-fashioned attention to detail.

It’s important to always be careful and rely on good old-fashioned attention to detail.

Oddly enough, this is complicated by the fact that the latest developments in machine translation, and particularly in neural machine translation, have led to great improvements in the flow and grammatical accuracy of the output: the language can sound so natural that it can trick post-editors into thinking that there is less to edit than there actually is.

This means that translators working on post-editing jobs should not underestimate the task at hand: yes, they do have the existing skills to be ready for it, but the process might be more mentally complex than they initially expect.

When is MTPE the right solution?

This is all well and good, but what should a buyer of translation services ultimately make of this information? And what should a language services provider take into consideration when offering translation and post-editing of machine translation?

It all boils down to the intended purpose of the text (and in turn, your buyer): homogenising the text might sound like a terrifying thought, but if you’re ordering the translation of a safety data sheet for a chemical product or a list of ingredients for a beverage, is the flow of the language really that important? Wouldn’t the opportunity to be faster and more productive when translating these texts with machine translation – which thrives on repetition and recurrent patterns – be far more appealing?

Wouldn’t the opportunity to be faster and more productive with machine translation be far more appealing?

And at the other end of the spectrum, if you’re dealing with a text that’s very creative, for example a client’s website that’s on view to the public, it might be preferable to consider a different approach. In this case, machine translation might not be the best solution and you should consider opting for transcreation for a better end product.

A good example of correctly used machine translation is usually an engine trained and used for a particular domain or text type. For instance, an engine built entirely with and for legal texts will generally perform well with the often formulaic and standardised terminology and constructions typical of that domain. Neural machine translation should also be the best solution here, since legal texts tend to have lengthy, verbose sentences that can be quite time-consuming to break down and translate manually without extra aid.

It’s safe to say that the decision to use MT should be made on a domain-by-domain and perhaps even job-by-job basis. If you want to know more about when it’s the right solution, download our free Guide to machine translation.

Making machine translation work for us – part 3 /making-machine-translation-work-for-us-part-3/ Wed, 15 Aug 2018 19:10:09 +0000

In the previous two parts of our interview with STP’s machine translation guru, Mattia Ruaro, we discussed different kinds of machine translation (MT), the way the technology is changing, and how it can and should be used in the translation industry.

In this final part, Mattia shares his thoughts on how translators can use MT as a tool – and how STP is going about it.

You mentioned that editing machine translation output is a skill all of its own for a translator. How does it differ from translation?

I’d say that machine translation post-editing is not really that different from translation these days. Of course it’s quite different from translating a text from scratch in a word processor, but I think sometimes people forget that translators very often work with translation memories (TMs) nowadays. So they don’t necessarily have a blank slate even without MT.

How does working with machine translation compare to working with translation memories?

It’s somewhat similar; essentially, you are editing matches in both cases. In the case of TM matches, a tool will suggest translations of similar sentences that have been translated before and stored in a translation memory file attached to the project.

The translator might, for example, have a 95 per cent match where only the punctuation is different to that of the sentence they are looking at – or perhaps there is just one word that is different. Translators have become used to editing TM matches. An MT match is often much less accurate, but it’s a starting point.

How does the process of post-editing differ from the process of translation? What does a translator need to know before starting this?

The biggest problem, particularly for inexperienced editors, is bearing in mind that MT output is the work of a machine, not a human. You can’t trust a machine the same way you can trust a translation memory match from a previous translator.

This seems like a fairly straightforward distinction – the clue is in the name. But many struggle to make this distinction.

Another thing is training: there are very few training materials and resources available. This is why we recorded webinars for our freelancers, and all our in-house translators have received training too. We can’t give people MT output and expect them to just deal with it.

Machine translation post-editing (MTPE) is not as intuitive as people think: training, experience and knowledge are necessary. It’s really helpful to try to understand why the machine produces the output it does – but this is something that requires an understanding of technology.

From my perspective, it’s really helpful to have very specific feedback from translators, as training the engine requires precision.

You can and should be able to influence the engine quality – you can train the engine as well as the translator. If you “put yourself in the machine’s shoes”, things start to fall into place.

STP is certified in MTPE according to the ISO 18587 standard. Why is this?

It shows the amount of effort we’ve put into learning, understanding and using this kind of technology as a company. And this isn’t just the case for the technology team – our production teams have put in a lot of work as well.

Adhering to the standard is something we are doing with everyone’s best interests in mind; we’re trying to contribute to making a positive difference in the industry.

The standard is basically a set of guidelines – I would describe them as a collection of best practices. They raise the bar for everyone in the industry. Companies that care about these standards can promote them and counter the misuse of MT technology.

Do you think there is a lot of deliberate misuse of MT in the industry?

Some, certainly. There are companies trying to pass off raw MT output as translation and sending it out to vendors as regular revision projects, for example. But these agencies know what they are doing – and the revisers can spot this kind of thing a mile away.

There are some companies that lack information on the MT that they are using – or that they are expecting their vendors to use. They simply don’t know how good the MT output is, since they don’t have in-house people proficient in the relevant languages to check and provide feedback on it. STP only generates output for languages that we can check in-house. That way we know exactly what sort of quality it is.

Would you say that MTPE is faster than translation without MT?

There has been a lot of talk about MT improving productivity, but most of the research on this is done with very few people who are not working with strict deadlines. These circumstances do not really reflect the way in which translators work in the commercial world. The studies often make flawed assumptions too.

At STP, we can test the effectiveness of MT as a tool internally. We have a lot of information on our translators and they already work with deadlines and under pressure, which makes them ideal test subjects.

How do you measure something like this accurately?

We have data based on edit distance – how different the final, edited output is from the raw, unedited MT output. In general, it seems that people are more productive with MT than without, though that doesn’t necessarily mean the quality is good.
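To make the idea of edit distance concrete, here is a minimal sketch in Python. It assumes a word-level Levenshtein distance normalised by the length of the longer segment; the interview does not say which variant STP uses, so the function names and the normalisation are illustrative assumptions only.

```python
def edit_distance(a, b):
    """Levenshtein distance between two token sequences."""
    prev = list(range(len(b) + 1))
    for i, tok_a in enumerate(a, start=1):
        curr = [i]
        for j, tok_b in enumerate(b, start=1):
            cost = 0 if tok_a == tok_b else 1
            curr.append(min(
                prev[j] + 1,         # delete a token from the raw MT
                curr[j - 1] + 1,     # insert a token into the raw MT
                prev[j - 1] + cost,  # substitute (or keep) a token
            ))
        prev = curr
    return prev[-1]


def post_edit_ratio(raw_mt, post_edited):
    """Share of words changed between raw MT and the post-edited text.

    Hypothetical helper: 0.0 means the output was left untouched,
    1.0 means it was effectively rewritten.
    """
    raw_tokens = raw_mt.split()
    edited_tokens = post_edited.split()
    longest = max(len(raw_tokens), len(edited_tokens))
    if longest == 0:
        return 0.0
    return edit_distance(raw_tokens, edited_tokens) / longest
```

A low ratio suggests the translator accepted most of the MT suggestion; as noted above, it says nothing by itself about whether the final quality is good.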

How does STP measure machine translation productivity?

Basically, we are making an effort to track productivity gains. We are doing this by recording how much time projects where no MT is used take compared to MTPE tasks. It’s not the perfect metric, but we need some hard data on MT and how useful it actually is.
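As a rough illustration of this kind of time-based comparison, the sketch below computes words per hour for two sets of projects and the relative difference between them. The data layout (pairs of word count and hours) and the function names are assumptions made for the example, not a description of STP’s actual tracking.

```python
def words_per_hour(projects):
    """Average throughput over a list of (word_count, hours) pairs."""
    total_words = sum(words for words, _ in projects)
    total_hours = sum(hours for _, hours in projects)
    if total_hours == 0:
        raise ValueError("no recorded time")
    return total_words / total_hours


def productivity_gain(no_mt_projects, mtpe_projects):
    """Relative change in throughput of MTPE tasks versus tasks without MT.

    Returns e.g. 0.2 for a 20% gain; a negative value means MTPE was slower.
    """
    baseline = words_per_hour(no_mt_projects)
    with_mt = words_per_hour(mtpe_projects)
    return (with_mt - baseline) / baseline
```

A figure like this only becomes meaningful once it is set against the other factors that affect total effort, such as special instructions that have to be read and implemented.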

Is the difference that MT makes reflected in STP’s translation rates?

For us, it’s really not as simple as that. In terms of efficiency, we want to be sure we know what we are actually getting.

I see a lot of nonsense numbers being thrown around. For example, MTPE is supposedly 50% more efficient than translation. Even if there are time-saving aspects to this, it’s not realistic to put it in those terms.

The productivity increase needs to be contextualised as well. There are often other aspects that slow the work down, such as special instructions that need to be read and implemented.

At STP, we want to take into account the total effort people put into a project. And, at the end of the day, you still have to do the work – the engine just provides suggestions.

Based on the feedback we’ve had from our translators, so-called “high fuzzies”, meaning TM matches that are ranked as a 75% match or higher by the CAT tool, are almost always more helpful than MT matches. So when our translators use MT, they are only using it for sentences where there are no “high fuzzies” available. So far, this has been a useful approach for us.
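The rule described here can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the 75% threshold comes from the interview, while the function, its parameters and the way a CAT tool would expose match scores are hypothetical.

```python
HIGH_FUZZY_THRESHOLD = 75  # per cent, the cut-off mentioned in the interview


def choose_suggestion(tm_score, tm_target, mt_target):
    """Pick the suggestion to show the translator for one segment.

    tm_score  -- best TM match percentage for the segment (0 if none)
    tm_target -- target text of that TM match, or None if there is no match
    mt_target -- raw MT output for the segment
    """
    if tm_target is not None and tm_score >= HIGH_FUZZY_THRESHOLD:
        return ("TM", tm_target)  # a high fuzzy takes priority over MT
    return ("MT", mt_target)      # fall back to MT when no high fuzzy exists
```

For example, a segment with an 82% TM match would be served from the translation memory, while one with a 60% match would get the MT suggestion instead.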

The one thing that is perhaps different at STP is that we have over 70 in-house translators who can help us develop our approach.

How does having a large team of in-house translators help?

They are all professionals who have been trained to post-edit MT output, and they are happy to help us develop the engines further. I can understand how a smaller company might find this harder.

At STP, we work with a small number of languages on a daily basis, so that means fewer engines to worry about than some other companies.

If people are not happy with something, we can try to improve it – or abandon it if that doesn’t help. We can go back to the drawing board.

How do you work with the in-house teams in practice?

We have one person for each target language who is our go-to person for MT development. So far, we’ve had this for all the Scandinavian languages and English. I work with these MT “power users”, or MT experts, when I need feedback.

It’s easy to do this with translators who are genuinely interested in the process and the technology. The technology would not really be worth much to us without our translator teams – their effort is crucial in all stages of the process.

 


Learn more about machine translation here.

The post Making machine translation work for us – part 3 appeared first on sa国际传媒.

]]>
Making machine translation work for us – part 2 /making-machine-translation-work-for-us-part-2/ Thu, 02 Aug 2018 09:14:09 +0000 /?p=11898 In part 1 of our interview with Mattia Ruaro, STP’s resident machine translation specialist, we talked about machine translation (MT) in general: how it works, how it has been used at STP and what companies can do to train the MT engines they use. In part 2 today, you can read Mattia’s thoughts on the ...

The post Making machine translation work for us – part 2 appeared first on sa国际传媒.

]]>
In part 1 of our interview with Mattia Ruaro, STP’s resident machine translation specialist, we talked about machine translation (MT) in general: how it works, how it has been used at STP and what companies can do to train the MT engines they use.

In part 2 today, you can read Mattia’s thoughts on the newest development within MT technology, which has people predicting the end of translation as we know it: neural machine translation.

So, Mattia, what is neural machine translation? And what’s with the hype?

Neural machine translation (NMT) is essentially the same as statistical machine translation (SMT), but there is more of a “brain” behind it. NMT can potentially improve itself over time and learn on its own.

The vital difference is the amount of data an NMT engine needs – which is way, way more than a traditional SMT engine.

Essentially you have nodes that establish connections on several levels, such as the context and clause level. This makes NMT more flexible – it can analyse shorter bits of text, so the flow of the target output tends to be better.

We often joke that when you train an SMT engine, you’re training a machine. Neural is more like teaching a child a language – or bringing up a bilingual child! While the engine is learning, it makes plenty of mistakes along the way, of course.

How does NMT output compare to previous technologies?

The first thing is better fluency. The output from an NMT engine tends to be more idiomatic, meaning it reads more like natural language. More often than before, the engines are able to use an appropriate synonym or expression within the context of the sentence at hand.

Adapting to the immediate context helps a lot with languages like German or Danish that have complex syntax. Subclauses separated by commas are interpreted more accurately, for instance.

One key aspect of NMT is that it interprets morphology better. For example, a verb in the first person would usually be rendered as an equivalent verb in the first person. So, if the source says I write in English, the target would be j’écris in French, with the correct ending. If the engine cannot recognise the person, it will give you the next best thing, which is usually the verb in the infinitive (for example écrire). This is then easy to edit manually.

We talked about training MT engines before. How does training NMT engines differ from SMT and RBMT (rule-based machine translation) engines?

NMT needs a lot more data than SMT and RBMT. The biggest hindrance to adopting NMT in the first place is that smaller companies can’t find enough data. To get started, an NMT engine needs at least 10 million words of data.

By comparison, an SMT engine can be good as long as the data is good; you can get a decent SMT engine with as few as a million words.

So, NMT is much more about quantity over quality in this respect! Just to put this into perspective, our Finnish NMT engine has 140 million words right now.

Another thing is training the engine. NMT engines tend to resolve issues themselves based on data you add – they come up with rules. You can still add rules if you want, but sometimes this can be counterproductive – you risk doing too much, being too strict.

For example, a German to English translator at STP was wondering why the German-English engine was translating personal names. It turned out that these specific names were also all meaningful nouns (such as the surname Müller, which means “miller”). This means we had to consider the need for a new rule carefully, since the noun Müller (capitalised, like all nouns in German) might come up in a text about millers later.

In this case, leaving it alone and replacing the translated name manually each time was the easiest thing to do. It’s an easy mistake for the translator to spot. You see the error, you check the source and you fix the output accordingly. No one is expecting the output to be perfect.

Will NMT replace human translators?

A hundred times, no! A technology like this is only as good as the use you make of it.

I could imagine a situation where a company with several offices around the world would need internal communications, such as short messages from HR, translated very quickly. These could be run through a specialised engine the company has developed and trained for that purpose. The translation wouldn’t be high quality, but people would get the gist. But this would be internal communication and nothing customers would ever see – just for information purposes. Another example is using MT to translate large amounts of survey responses for market research purposes.

But this is not how it’s been used or how it is perceived by many. Many early adopters of machine translation have misused the technology, which has affected its reputation.

The key thing is to use MT output appropriately. Professional translators can use it as a tool. It has even been suggested that post-editing output produced by an MT engine could be a separate service one provides as a translator, as long as you know what you are doing.

Translators are not being replaced; it is just that the way they work is changing.

Does NMT technology work differently with different language pairs?

It seems it has done, for some language pairs. For instance, English-Japanese is working quite well, which I find quite impressive. Nordic languages have not been concentrated on much, as they are smaller.

German output seems to suffer from the syntactic complexity and strictness of the language, and capitalisation is a huge issue. Romance languages seem to be working fairly well; NMT engines seem to cope with their verb paradigms and tenses.

Rather than the language pair, the issue is more the target language itself. Obviously Finnish has been a bit of a headache for us.

Why is Finnish more difficult for NMT?

I think morphology is more important, the grammatical complexity within words. The engine will have a harder time discerning the different parts of a word.

The Finnish case system is a real challenge for the engines. Each case ending is a variable, and you need to consider this variable in every scenario. Finnish has 15 different cases and there are several possible endings for many of those cases, which means there are a lot of potential alternatives.

So far, I have only heard of one company making a Finnish engine work really well in terms of morphology and fluency. And that can only be achieved by specialising in one language.

How costly is neural machine translation? Is it worth investing in NMT?

Very costly. You need powerful servers to process the amount of data we’re talking about. If SMT is like driving a car, NMT is more like flying a jet – the fuel costs are much higher. It’s a lot more affordable now than it was before, though. More and more options are becoming available and prices are falling.

In terms of cost-efficiency, I would say that, if used correctly, MT has the potential to really speed up translation in established workflows.

How secure is MT in general and NMT specifically? How can we be sure that personal data and other data is safe?

It’s as secure as you want it to be. It depends on who deals with your engines and how. We have third-party technology, but we’ve checked their locations and their background.

We also clean the data to keep it secure so that no personal data gets used to train engines. Even Google no longer reuses the data you send back to them. For a while now, they have limited themselves to the data from Google itself rather than using the final output from the translators.

In other words, I think machine translation is very safe.

 

In part 3 of the interview with Mattia, we will talk to him about the practice of machine translation post-editing and how translators can learn to edit the output from MT engines.

 


Learn more about machine translation here.

The post Making machine translation work for us – part 2 appeared first on sa国际传媒.

]]>
Choosing the right tools: the importance of investing in translation technology /choosing-right-tools-importance-investing-translation-technology/ Thu, 19 Jul 2018 01:15:56 +0000 /?p=11876 There is no such thing as a successful modern language service provider that does not invest in sound translation technology. The modern translation pipeline – from quote to translation, revision, quality assurance checks and, finally, to delivery – would be a chaotic mess without the appropriate technology: translators would repeatedly translate similar texts from scratch, ...

The post Choosing the right tools: the importance of investing in translation technology appeared first on sa国际传媒.

]]>
There is no such thing as a successful modern language service provider that does not invest in sound translation technology.

The modern translation pipeline – from quote to translation, revision, quality assurance checks and, finally, to delivery – would be a chaotic mess without the appropriate technology: translators would repeatedly translate similar texts from scratch, it would be difficult to ensure consistent use of terminology, and it would be extremely complicated to check for errors in a project. It would take much longer to translate formats such as PDF, XML or InDesign, and there would be no such thing as real-time collaboration.

Translators and language service providers (LSPs) use computer-assisted translation (CAT) and translation memory (TM) to increase productivity and improve the translation process.

It is also increasingly common to see machine translation (MT) integrated into translation management systems and workflows. MT can be a helpful tool despite the raw machine translation often lacking in accuracy and fluency. However, with the continued rise of neural networks powering MT, the industry is expecting quality to improve and adoption to increase.

Translation technology is all around us, but ubiquity is not the main argument in favour of investing in it.

Efficiency, compatibility, scalability

Productivity is the primary driver behind the adoption of CAT tools, TM and MT. Efficiency and effective project management are the key reasons for the spread of commercial and proprietary TMSs.

According to translation and interpretation forum ProZ.com, their members have seen their productivity increase in recent years, citing more experience as a key factor. Many have also said that improved terminology resources and proficiency with TM tools have helped boost their productivity.

Today, the proliferation of translation technology tools means that LSPs have yet another incentive to make use of them: compatibility. They need to be able to cater to and match the tools used by their clients and, if possible, their translators and interpreters.

Translation technologies have developed separately and are driven by different forms of demand – productivity, effective project management, shrinking deadlines, automation, cost reduction etc. However, as part of a unified technological infrastructure, they all work towards shared goals.

The various technologies make a translation company efficient, compatible and scalable; able to meet the various demands of its client base and deliver continuous service consistently.

The language industry research firm Common Sense Advisory (CSA) puts it succinctly: “Language service providers have .” CSA is spreading this message more widely, recommending a move away from simple translation-centric business models to content-centric ones and advocating strategic roles, and this also applies to the tools that allow LSPs to manage complex content needs from clients.

At STP we work with over 15 commercial and proprietary CAT tools on a regular basis. This mature translation technology ecosystem lets us handle a variety of different tasks from over 400 clients without breaking a sweat. Jesper Sandberg, STP’s Executive Chairman, explains that this works for STP: “For a language service provider of STP’s size, it is vital to have trustworthy translation technology providers. In May 2012, STP adopted memoQ as our primary CAT tool. A lot has happened since then; many new tools have appeared in the market and others have managed to catch up with their competitors.

“We are capable of working with most of these tools, but if given a choice, we prefer to work with memoQ. When you have multiple teams spread across different countries, it is easier to become power users of a single tool rather than many, and thus get the optimum return on your investment”.

People, process, technology

At the centre of this business model supported by translation technology, we have the people who make the projects happen. The technology makes translation companies more efficient. It empowers them to be endlessly compatible with client needs and enables them to scale their success as they grow.

However, when it comes down to it, technologies are only tools. Their potential is unlocked by the people who use them. So it is not only about having the right technology, but also about training your people to use it properly. The ProZ.com members who reported that translation technologies improved their productivity will have spent time learning to use CAT tools and TMs. There is a learning curve initially, but the payoff is worth it.

Human-machine interaction is an important factor in determining whether the tools help the people become better at their job. For a language service provider, investing in translation technology also means investing in training. Raisa McNab, STP’s head of Learning and Development, explains the importance of translation technology in our recruiting and training processes.

“During our induction we do not just focus on training linguists how to use memoQ – we also teach them the fundamental concepts of what CAT tools are and how they work, so that it will be easier for them to pick up other tools later. We do not require new recruits to be familiar with CAT tools before they join STP, but those who have had previous experience or exposure to CAT tools during their studies have a head start. Knowing the basics of CAT tools definitely makes the initial learning curve less steep. Our Tech team runs the training programme and they also ensure new recruits continue to have access to further training resources once the induction stage is over.

“We also have an extensive Knowledge Base of CAT user guidance and workflows linked to learning units and recorded webinar tutorials, which are tied to a new recruit’s development and monitored by their team leader and manager.”

Up-to-date and up to the task

In the modern language services landscape, allowing your technological infrastructure to become obsolete is sounding your own death knell.

Each of the companies behind the various translation technologies used today strives to keep its software and tools updated. They incorporate new data, new technologies, improved capabilities and additional concerns driven by real-life needs such as better security and data risk management.

LSPs need to keep abreast of technology as well. This means leveraging new features and reviewing workflows to optimise processes and automate as much as possible. Adam Dahlström, STP’s IT Manager, explains the processes we follow to stay up-to-date with CAT tools.

“We store our linguistic assets on our memoQ server and distinguish between maintained and temporary resources using a naming convention and metadata structure. This helps us automate access to relevant language resources, while ensuring maximal client integrity and data security. Whenever new features are released in memoQ or other tools, we investigate them, but also document and provide training for our staff as needed.”

Staying current is not just a matter of clicking a button and installing software updates. It is all about keeping the entire infrastructure working at optimum efficiency.

Investing in the right translation technologies is a prerequisite in today’s language services landscape. The key to doing this successfully is choosing the technology carefully, planning and executing continuous training and support for the tools, and being aware of technological trends.

The post Choosing the right tools: the importance of investing in translation technology appeared first on sa国际传媒.

]]>
Making the most of Machine Translation /most-of-machine-translation/ Fri, 15 Jun 2018 10:59:14 +0000 /?p=11817 It’s pretty much twenty years I’ve been in this industry, from when I first started a degree in translation, naively thinking I would and could be a professional translator, to spending the best part of the past ten years doing production management, business and IT development and training. In that time, machine translation (MT) has ...

The post Making the most of Machine Translation appeared first on sa国际传媒.

]]>
It’s pretty much twenty years I’ve been in this industry, from when I first started a degree in translation, naively thinking I would and could be a professional translator, to spending the best part of the past ten years doing production management, business and IT development and training.

In that time, machine translation (MT) has gone from:

“Not there” to

“Yeah right, ha ha, never going to happen” to

“Oh, this client’s doing it, but it’s pretty awful” to

“Maybe we should consider doing it?” to

“Doing it, and it’s not that bad” to

“Actually, it’s just another productivity tool”.

These days, we see a lot of MT at STP. Our excellent Technology team develop and maintain a host of MT engines for our internal use. We get MT output from clients and end-clients, and it ranges in quality and type from pure Google Translate to highly customised account-specific engines. What has been interesting is that companies have almost exclusively wanted a product which is full human quality.

If you ask me, the bottom line with MT is that when it’s used correctly, it allows us to translate more content faster, and within the same budget as before MT. And that’s great, it means that our target languages aren’t particularly threatened by English, as companies continue to see the value in producing content in their customers’ native tongues. For someone with a degree in Finnish translation, that’s a nice thought – there are only 5.5 million of us Finns after all!

What has become abundantly clear in the past few years of STP ramping up our use and development of MT is that our linguists’ MT post-editing skills are at the core of our ability to produce that full human quality. And that requires training.

This spring, we were certified to ISO 18587 on machine translation post-editing. This is a new ISO standard that has been developed to address the requirements for post-editing skills and training, rather than the technical development or implementation of MT engines. It’s not a particularly onerous standard to meet, provided that you are running a legitimate operation.

What the standard does do, though, is put the onus on the language service provider (LSP) to provide appropriate, robust training which ensures that the linguists working on MT output know how MT works, how post-editing is different to editing translation memory matches, how to give feedback and improve the engines efficiently, and how post-editing is best approached. And I think that’s the least we owe our translators.

And what being certified to the standard does is that it tells not only the outside world but also our clients and translators that we as a company know what we’re doing with MTPE. It tells them that our linguists are trained and know what they’re doing with MTPE, and that, essentially, it’s safe to trust your MT in our hands – what comes out the other end is another great STP translation.

I am sometimes a bit jealous of our translators who have made my old dream a reality, especially when it comes to figuring out how to use technology in the translation process. That said, I realised a long time ago that I would have at best been a mediocre translator, so I’m glad I found my calling on the business side of things. I certainly wouldn’t want to move to another industry, that’s for sure!

Raisa McNab is STP’s Learning and Development Manager and the ATC’s Lead on Standards. She holds an MA in Translation from the University of Turku in Finland.

This article first appeared in the June 2018 edition of STP’s Icebreaker newsletter.

 


Learn more about machine translation here.

The post Making the most of Machine Translation appeared first on sa国际传媒.

]]>
Talking about automatic speech recognition /voice-recognition-lsps/ Tue, 06 Dec 2016 13:39:27 +0000 /?p=10526 By Ryan Bury, English in-house translator Whether you’re chatting with Siri, Cortana or Alexa, modern technology invariably gives users the chance to – quite literally – find their voice. And it’s no different in the language services industry. Automatic speech recognition (ASR) has been around for over half a century under various guises, and flexing ...

The post Talking about automatic speech recognition appeared first on sa国际传媒.

]]>
By Ryan Bury, English in-house translator

Whether you’re chatting with Siri, Cortana or Alexa, modern technology invariably gives users the chance to – quite literally – find their voice. And it’s no different in the language services industry.

Automatic speech recognition (ASR) has been around for over half a century under various guises, and flexing your vocal cords to ‘write’ a text is nothing new. But where does the potential of this c?

Automatic speech recognition to boost productivity

There’s no doubt that under the right circumstances, ASR can dramatically boost that magical words/day figure, with anecdotal evidence at a suggesting that 10,000 words per day wasn’t out of reach.

Naturally, even a fraction of this increase would be music to the ears of company bosses, with enormous potential for gains both in terms of output and, of course, profit.

Health concerns have also played a major role in the increased popularity of ASR technology, with the avoidance of tens of thousands of keyboard taps per day a particular plus point for translators and their handiwork.

Indeed, various such ergonomic solutions have made everyday life much more comfortable for employees of companies such as STP. However, is the process of actually integrating this tool trickier than it might appear?

Using automatic speech recognition with other tools

For a translation company that already uses a plethora of tools, ASR must find a valuable and supportive place within the existing production process.

If translators have translation memory content and/or machine translation suggestions as a starting point, they need to be able to weave ASR into their workflow seamlessly to truly benefit from its productivity-boosting potential. And it mustn’t hinder any gains that would otherwise have been made through traditional TM leverage or MTPE.

With this in mind, the adoption of ASR – or even ASRPE, ASRMTPE, or any other such Scrabble-worthy acronym that might emerge – is no small balancing act.

Is automatic speech recognition practical in the office?

A translation company must also consider the logistics of introducing ASR into its toolkit and ask whether it can truly prosper in an office-based environment.

Background noise can prove distracting not only for the translator, but also for the tool itself. So is it possible to have several voices translating at once and still retain the productivity benefits on offer?

Then there’s the old cliché of the introverted translator. For those who work best when fully immersed in a text, and with minimal outside distractions, the switch to the infamous anti-concentration fiend that is a noisy workspace could be a troublesome one.

These days, of course, remote working is becoming ever more common, and linking these two phenomena could well pay dividends. As a remote worker myself, I can safely say that – notwithstanding the occasional dog bark – it is generally much easier to make a home environment suitable for ASR than it would be in an open-plan office setting.

Evidently, there are a great number of issues to consider for any LSP with regard to voice recognition, but those productivity-related whispers may ultimately prove impossible to silence.

This post first appeared in the .

The post Talking about automatic speech recognition appeared first on sa国际传媒.

]]>