I look forward to the day when I'm wearing my headphones in a foreign land and hearing all of the discussions in my own language.
The "universal translator" which was part of Star Trek and a lot of other Sci-Fi I was exposed to as a kid was something I was really fascinated with. My Dad worked as a simultaneous French->English translator and sadly spent long hours away from home and, as a kid, I started trying to build a translator so that it could do his work and he could be home more.
Translation is important work and one that could help a lot of people. It's my hope that we get to the point where these models work entirely on locally carried resources.
how am i supposed to talk shit with my friends about other people in public then
I'm curious to know how well these models can pick up slang. Maybe if you talk shit in as thick a slang as you can it won't be able to give a good enough translation.
With my bi/trilingual friends who speak the same languages, we intermix them to make our point more clear. Don’t think models will be good enough for mixes for a few more years, so we’re safe!
Can you show us an example of such a sentence?
Hm, think of things like “On va bruncher” (we’re going to brunch). The word “brunch” doesn’t exist in french, but we add suffixes to fit into the sentence. Very common in Montreal. My french isn’t very good to do that on the fly, but my francophone friends do that all the time.
In my other languages that I am actually fluent in, it’s kinda the same — you use specific suffixes to soften or embolden your point and so on. Maybe add “exclamation making sounds in specific language” too. Eventually your nouns and verbs end up in different languages, with different suffixes where it “makes sense”, yet the person whom you’re talking to will “get it”.
Would be curious to try the new Seamless model on such speeches.
I would think this model would fail with a heavy Québécois lingo, as opposed to standard French.
This is extremely common for every new technology: “upload,” “download,” “stream,” “google,” “FaceTime,” most code patterns, all the new ML apps, “venmo” or whatever the name of the app you use for payment, etc. All of those are taken as is, get a verb ending slapped on, and it’s good enough. That’s true in German, Danish, Dutch, French, Italian, and Spanish.
The only thing that doesn’t work is if you talk to people too young to remember Skype. Then you feel old.
Reinventing polari is certainly one way to make yourself less understood...
Cockney English and French Verlan come to mind.
I don’t know about cockney, but verlan is very alive.
I'd love to see a map of how it matches up to regional English/British accents and their slang.
learn Klingon?
Klingon is definitely going to be in the top 50 languages covered…
Speak in metaphor and/or code.
I’ve been in mixed language communities in which I wasn’t sure who spoke what, and I have found this to be quite effective when done right.
Good time to reference the ST:TNG “Darmok” episode and quotes like “Darmok and Jalad at Tanagra”.
Cincinnati when the Turkeys fell.
get better at double speak https://en.wikipedia.org/wiki/Doublespeak
If I am not wrong, Google Pixel buds offer a live translate feature.
Not in the voice of the original speaker.
now if I could just get the pixel buds tech to remove the voice of the original speaker and translate some youtube videos from thick accent english into no accent am-english.
Obligatory, not directed at you in particular since I'm sure you mean no offense, but just voicing a pet peeve:
I grew up bilingual outside the US, and speak English with a hybrid British/Indian/Middle Eastern accent (with some of my personal quirks, and mixing increasing amounts of various American accents over time). I can understand English in nearly any accent (Singaporean, Chinese, Vietnamese, Indian, Nigerian, eastern European) as long as the words involved are globally used and the grammar is passably queen's. Especially after hearing it for about an hour. And people who natively speak English with these various accents usually can understand my English better than they can an average American accent. Yet in this country, my accent is belittled, despite being perfectly understood and more versatile. Even by others who don't speak with the American accent!
This is the problem of the "default accent" anywhere being referred to as "no accent", and therefore anything deviating is considered "having an accent". This makes "accent" a negative trait, scaling from 0-bad to heavy-bad. But if the vernacular were such that we said "American accent" instead of "no accent", then no one's accent is bad, just unfamiliar.
Most of my non-American peers who were raised on English have a better command of the language than my American ones, yet they are mocked for their accents as if they don't know the language, when in reality it's the Americans' lack of familiarity with the language (as it's used globally) preventing them from comprehending it.
So yes, put in more work, the world is shrinking and English is the global language (for better or worse). What you're saying is spoken from a position of privilege because the culture allows you to mock others' accents and imply your version of it is the correct one that everyone else should put in work to provide you with, rather than the other way around.
Every time you hear English with an accent other than British, American or Australian, remember that it usually means the speaker knows at least one entire other language as well, probably one that you would sound like an idiot if you tried to speak it. Don't be rude or dismissive of their command of English.
In fact, you were so close — you called it a "no accent am-english", when you could have just called it what it is — "an american accent".
I appreciate your sharing, and stating that you assume I meant no offense, and that your thoughts are not directed at me specifically.
I could have been more specific, but my request for the tech to vary would, I think, lead to specific options for different people.
And actually to be even more.. not sure the word.. I want 'the Chicago accent' I think it's called, or midwest / no accent. Personally as much as I enjoy some entertainment from Jersey / NY accents, I would not volunteer to watch tutorials on tech taught by the Sopranos cast - as funny as that might be (and I get if you are from the NE, you may be learning just fine being taught with such a language style).
As annoying as some of the Cali style of language is, I can understand the words and meanings without squinting my ears and spending double the brain cycles trying to understand the words, while then interpreting the meaning, and then trying to put together concepts for understanding new ways of coding or using tech.
I've run into folks in Louisiana that I could not understand at all and had to ask for an interpreter at a gas station. From Florida to Chicago to Seattle down to Miss and Ala - I can hear what people are saying and learn without spending lots of extra energy trying to understand.
With that being said, I understand there are parts around Miami where accents may be thicker (or not) - and with some folks, even if they're using the right words and grammar, I may need to slow down the speech to actually learn if they were teaching a class.
The slow down and speed up options already exist with youtube.
"So yes, put in more work"
- I do try a bit. I don't mind accents with some folks and media. For example, I can listen to and enjoy Shankar sharing via the 'Hidden Brain' series, partially because his accent is limited but also because the media requires less thought intensity.
I have tried many youtubes, and bought a few courses taught from folks in India and other places where I just could not muster the energy. I literally squint with my ears and feel like my head gets hot trying to decipher what is being said, translate into what is meant, and how it should create new patterns of understanding in my brain.
I can only do that for so long and I am done. Now I just skip any learning video that has non-am English speakers. When I consider courses to sign up for or buy, I have to research the authors / speakers and find video of them to hear the audio, because I just can't learn well that way.
"other than British," - True story, a few years ago I had to call an ISP in Britain(?) and I could not understand the person I got to file an issue with. I had to ask 'what did you just say' many times. I laughed at myself for even thinking of saying 'can you slow down and speak clearer English please' - I mean, crazy... I was paying by the minute for the long distance at the time and it ended up being a 25 minute call that could have been 10 if I had a magic translate-without-accent device.
"a position of privilege because the culture allows you to mock others' accents"
- This is truly not about mocking accents, this is truly about my lack of ability to learn well.
Yes, I would definitely sound like an idiot trying to speak another language. Like I said, I do not learn as well as some others.
Truly not my intent to be rude. I apologize if the shortness came off that way, I was trying to be brief in the hope that there's a chance that some tech like this exists and someone here could point me to it. Before I posted, I DDG'ed it and found a couple of things attempting to be in that space with a 'speak to sales' type of 'you'll never afford this' button for info.
I will never be dismissive of anyone's command of English, or other spoken language, or computer language or anything like that. There is no way for me to know someone else's situation and circumstances led them to their current command of whatever language. If someone is trying to learn more at any age; I applaud and encourage them - being rude or dismissive does not encourage more learning.
"no accent am-english", when you could have just called it what it is — "an american accent". - Well maybe, but actually I meant to be more specific, as mentioned a bit above - I mean '"no accent" American accent' - because there are plenty 'American accent' types that I would want removed by a magic earpiece to make it easier for me to understand and learn.
I appreciate the thoughtful reply. I don't think you're rude, and I get what you're saying as someone who thinks a lot about accents and languages. However, I still think you missed my point.
There is no "no accent". An accent is a baseline feature of intelligible human speech, like a voice, or a volume, or a language. You can't say stuff without those features. When you say "the Chicago accent", or the "Midwest accent", that's an accent! Not "no accent".
I understand it's common usage to refer to the default "radio accent" as "no accent", but in a country like America, all kinds of people with all kinds of accents speak English. Reinforcing an expectation that a certain (usu. majority-white-spoken) one is the "default" by referring to it as "no accent", implicitly suggests all others are erroneous affectations, even if I trust that is not your personal intent.
All that said, I think your idea for a translation device capable of revocalizing what is said with an unfamiliar accent into one you are used to is not a bad one, and likely easier than translating between languages while retaining expressiveness.
Wow, you just keep digging in don’t you? When these Americans you deride say “no accent”, do you think they are referring to the “majority-white-spoken” Scottish accent?
No, of course not. Get that race baiting out of here.
https://www.bbc.com/culture/article/20180207-how-americans-p...
What accent? Whose accent? Brits are as diverse accent-wise as Americans: London, cockney, New England, Southern...
A lot of Indians that I know have a very "proper" British accent, one that is maybe a bit aristocratic; it's quite an irony for a former colony. https://www.bbc.com/future/article/20220915-what-the-queens-...
The context matters, but so does history.
There is another way of looking at this, in the context of the parent post: we could suggest that any accent could be converted to “no accent” where American accents are converted to British, or where standard Japanese is converted to a Nagoya pronunciation. Whatever seems like your preference of “no accent”. With this interpretation of the parent post, it’s not specifically about any particular English accent. I’ve been told by others that I have an accent yet I think I don’t have one - and honestly, I think most people have either encountered this - having an accent when you think you don’t have one - or haven’t travelled enough! :)
And I mean, yes, there are people who know they don’t sound like whatever ideal accent they have in mind, and there are people who will make fun of accents - but, and I can’t stress this enough, depending on the context literally any accent can be made fun of, sadly. I’ve had people mock my “American” accent while travelling, for example. It sucks, but it’s not easy to single out any accent as “default” unless it’s literally enforced by a government and taught that way in schools. Last I checked, the US is not one of those countries and English is not as centrally controlled as e.g. French can be.
This would carry some weight if you didn’t take an opportunity to take a shit on Americans’ English in the middle.
There are many American accents. Your suggestion makes the sentence much less clear.
And by specifying "american" they're already making it clear there is no such thing as a universal base accent for english.
This is a really interesting use case. I could definitely see this as a service for content providers to get more reach and I think you could justify a subscription price for the service based on this.
By creating speaker-specific tonal ranges and profiles, you maintain better cohesion in the final product.
It would be really cool as an assistance in practicing correct pronunciation and accent. Hearing your voice saying it right and then hearing how you actually said it the last time you tried might help you to get both into alignment.
I worked on building exactly this earlier this year. I was hanging out in Taiwan for a few months and thought, surely the Babel Fish should exist by now.
I did several experiments recording from all the microphones I could on my iPhone and AirPods while out in the wild. My conclusion: it's impossible right now for that hardware given the microphones we have and what they pick up.
So much of what's spoken is at a combination of (a) high distance (b) low volume (c) background obscuration. Something that was clear as day to my ears would barely register on the mics. While context is of course an issue, the raw audio didn't have enough to even translate.
The one caveat is that there might be low-level (i.e., Apple-only) access to headphone microphones that capture the environment to do noise cancellation. I'm not sure though---I couldn't find them on any API.
For cases where you do have clear audio, existing apps (e.g., Google Translate) are so close to achieving this, but don't let you specify audio outputs with enough fine grained control. By default, it will start screaming out of your phone what you were attempting to silently translate.
There's also some magic to the Universal Translator and Babel Fish: they perform zero-shot real time translation.
That is, they are able to translate (in all directions) novel languages that were not previously heard[0]. It is an open question, likely with a negative answer, whether there is a universal grammar even among humans[1] (the definition itself is vague, but even the most abstract version is suspect and highly unlikely to be universal across species). I think no one will be surprised if it remains impossible to interpret an entire language based on only a few words (let alone do it in real time).
This isn't a knock-down, because even a trained device is insanely useful; it's just a note about limitations and triage. This is awesome stuff and I can't wait for the day we have translation headphones. It's an incredibly complex problem that I'm sure is not short of surprises.
[0] There are a few exceptions such as Star Trek TNG's episode Darmok, S5E2, where the Tamarians' language is unable to be translated due to its reliance on cultural references (the literal words are translated but the semantic meanings are not). It's a well known episode and if you hear anyone saying "Shaka, when the walls fell" (translates to "Failure") they are referencing this episode (often not using the language accurately but who cares (nerds. The answer is nerds)).
[1] https://en.wikipedia.org/wiki/Universal_grammar
Can’t speak for ST, but did they ever say the babel fish understood languages it never heard before? I thought the galaxy was just exceptionally well-cataloged, given the HHG itself, and humans were hardly unknown.
The babel fish translated via brainwave energy and a telepathic matrix:
> The Babel fish is small, yellow and leech-like, and probably the oddest thing in the Universe. It feeds on brainwave energy received not from its own carrier but from those around it. It absorbs all unconscious mental frequencies from this brainwave energy to nourish itself with. It then excretes into the mind of its carrier a telepathic matrix formed by combining the conscious thought frequencies with the nerve signals picked up from the speech centres of the brain which has supplied them. The practical upshot of all this is that if you stick a Babel fish in your ear you can instantly understand anything said to you in any form of language. The speech patterns you actually hear decode the brainwave matrix which has been fed into your mind by your Babel fish.
“Now it is such a bizarrely improbable coincidence that anything so mind-bogglingly useful could have evolved purely by chance that some thinkers have chosen to see it as a final and clinching proof of the nonexistence of God.
“The argument goes something like this: ‘I refuse to prove that I exist,’ says God, ‘for proof denies faith, and without faith I am nothing.’
“‘But,’ says Man, ‘the Babel fish is a dead giveaway, isn’t it? It could not have evolved by chance. It proves you exist, and so therefore, by your own arguments, you don’t. QED.’
“‘Oh dear,’ says God, ‘I hadn’t thought of that,’ and promptly vanishes in a puff of logic.
“‘Oh, that was easy,’ says Man, and for an encore goes on to prove that black is white and gets himself killed on the next zebra crossing.
“Most leading theologians claim that this argument is a load of dingo’s kidneys, but that didn’t stop Oolon Colluphid making a small fortune when he used it as the central theme of his best-selling book, Well That about Wraps It Up for God.
“Meanwhile, the poor Babel fish, by effectively removing all barriers to communication between different races and cultures, has caused more and bloodier wars than anything else in the history of creation.”
I couldn't help but hear this in my mind as it was read in the voice of the narrator from the old BBC "Hitchhiker's Guide" mini-series.
I think the idea of the Babel Fish might run up against computational-complexity limits in some sense. Imagine a future "Theory of Everything" book written in an alien language. The book has a total of 1 million characters across its pages, where each character is distinct. Now the Babel Fish must be able to "translate" such a language to English, given its oracle-like powers? Can it do the job?
Well, then. Magic indeed!
Also a lot of spoken language involves context that AI is nowhere near understanding yet, let alone all the cultural baggage necessary to accurately translate/localize a lot of utterances.
"Can you stand up?" would be translated differently into Japanese depending on whether you're implying you need them to move their butt off your cell phone versus directly inquiring as to the function of their legs after a car accident. If you speak English and hear it as a background without the rest of the context being picked up, your brain instinctively knows it can interpret it either way, no problem.
But if you're Japanese and the AI picks a specific way to translate it, then you are completely unaware of the ambiguity because the AI resolved it with a 50% chance of being wrong.
nitpicky, but is it though? not really. and it's as much 'difference depending on what you're implying' as there would be in english comparing just saying 'can you stand up' or specifying 'from the seat/at all'.
Probably not the strongest example but there are definitely phrases that are specific in one language but ambiguous in another.
There are certainly nuances, even when 'understood'
Google: "A bit sticky, things are pretty sticky down there."
I'm on mobile so can't find the link but years ago there was a DARPA (iirc) program trying to solve this problem in the context of surveillance in a loud crowded room. Their conclusion was that there needed to be n+1 microphones in the room to be able to cleanly differentiate all of the noise, where n is the number of noise sources, which in their case was number of conversations going on in the room (assuming no other loud sources of noise like music).
I think it's totally doable but you'd need many more microphones in order to deal with real world noise. As MEMS microphone quality improves, this should eventually be possible with a combination of smartphone/headphone/some other device like something around your neck.
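The intuition behind needing roughly as many microphones as noise sources is linear algebra: if n sources mix linearly onto at least n microphones, the mixing matrix is (generically) invertible, so the sources can be recovered; with fewer mics the system is underdetermined. A minimal numpy sketch under idealized assumptions (instantaneous linear mixing, no sensor noise, and a known mixing matrix — real systems must estimate the mixing blindly, e.g. with ICA or beamforming):

```python
import numpy as np

rng = np.random.default_rng(0)

# Two independent "conversations" (sources), 1000 samples each.
n_sources, n_samples = 2, 1000
sources = rng.standard_normal((n_sources, n_samples))

# Each microphone hears a different linear mix of the sources.
# With as many mics as sources, the mixing matrix is square
# and (generically) invertible.
mixing = np.array([[1.0, 0.6],
                   [0.4, 1.0]])
mic_signals = mixing @ sources

# Given (or having estimated) the mixing, unmixing is just inversion.
recovered = np.linalg.inv(mixing) @ mic_signals

print(np.allclose(recovered, sources))  # True: sources recovered exactly
```

With only one microphone for two overlapping conversations, the same equations have infinitely many solutions, which is the regime the DARPA finding describes.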
The problem is you need a full sentence, plus surrounding sentences to properly translate a lot of things (aka context matters).
So no matter what, conversations in your native speech would have to be delayed before translation.
So then we need something like neuralink to get the whole thought from one's brain first, then the sentences are processed properly for the context, then translated before the speech is delivered.
Most thoughts are in a language. There is no one underlying universal machine language for the brain.
Are most thoughts in language? This doesn’t reflect my experience. Language floats on top, but there are layers under there. You can also feel it when you end up thinking in another language. It does not go through the first one but is a thing of its own.
Pretty sure there is nothing universal there though as you say.
My understanding is that they trained a separate model to specifically estimate when they have enough context to begin translating, as a skilled translator would.
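A classic simplified version of such a policy is "wait-k" simultaneous translation: buffer k source tokens before emitting anything, then emit one target token per additional source token. The sketch below is purely illustrative — the lexicon and function are made up, real systems use a learned policy and a neural translator, and word-for-word lookup obviously can't reorder across languages:

```python
# Hypothetical stand-in for a translator: a toy word lookup table.
TOY_LEXICON = {"ich": "I", "sehe": "see", "den": "the", "hund": "dog"}

def wait_k_translate(source_stream, k=2):
    """Wait-k policy: read k source tokens before the first output,
    then alternate one emitted target token per incoming source token."""
    buffer, output = [], []
    for token in source_stream:
        buffer.append(token)
        if len(buffer) >= k:
            # Emit a translation for the oldest not-yet-translated token,
            # having seen k-1 tokens of lookahead context beyond it.
            src = buffer[len(output)]
            output.append(TOY_LEXICON.get(src, src))
    # Flush the remaining tokens once the source stream ends.
    for src in buffer[len(output):]:
        output.append(TOY_LEXICON.get(src, src))
    return output

print(wait_k_translate(["ich", "sehe", "den", "hund"], k=2))
# ['I', 'see', 'the', 'dog']
```

Larger k means more context (better translations) but more lag; a learned policy, as described above, effectively makes k adaptive per sentence.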
My mom used to do English/French translation. Her favorite example was the word "file". That word has multiple translations in French depending on the context, and that context may simply be implied by who is speaking. You may not be able to figure it out based on the conversation alone.
Even the native original version needs the proper context. Sometimes you need the entire sentence to figure out what the sentence was really about.
I'm reminded of Mark Twain complaining about verbs arriving at the very end of sentences in German (among a myriad of other complaints):
"The Awful German Language" - Mark Twain https://faculty.georgetown.edu/jod/texts/twain.german.html
Sometimes you even need a second sentence, or even a few, to understand what the first sentence was about.
I think I could adapt to that. But it would be an interesting experiment.
Another lesson we can learn from Sci-Fi is very often different species on a planet would have their tribal / local languages and dialects but all spoke a common tongue. I think this is the more humanizing approach, rather than delegate even more of our fleshly processing power to machines.
This seems to be what is happening in Europe (and perhaps more generally across the globe), with English being the common tongue.
Question is, what will happen to the tribal / local languages? Will they survive?
Historically, we've seen the larger languages build themselves up by intentionally stamping out the teaching / use of smaller local languages. France banned some regional languages from appearing on broadcast television for years, etc.
This might be required to get full buy-in for a unified language, which is a bit sad but makes some sense: if you ensure it takes up more and more of media and culture, more people know it from immersion, and other languages are reduced to being spoken at home / with friends, which cuts into how many people are really fluent in them.
It varies. A lot of local languages have gone extinct already. There are linguists hard at work trying to document / record dying languages, but it won't be the same as living the language from childhood.
then of course, there's always Darmok and Jalad at Tanagra
I’m wearing the Ray-Ban Meta right now and they are already mind-blowing; I can already talk to that Meta AI assistant seamlessly. I bet one of the future iterations will have exactly this.
Curious, what do you ask it besides take a picture / video or what's the weather?
I have a pair and have only asked it that so far...
Whenever I have a question I would have pulled up Bard/ChatGPT for, if I’m wearing my glasses.
Kind of like having an expert next to you all the time.
I look forward to the day when that problem is solved by a company that doesn’t mine my data to sell ads.
Babel Fish