Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Export BytesMut::from_vec(vec) #615

Open
al8n opened this issue May 28, 2023 · 5 comments
Open

Export BytesMut::from_vec(vec) #615

al8n opened this issue May 28, 2023 · 5 comments

Comments

@al8n
Copy link

al8n commented May 28, 2023

Hi, can we expose BytesMut::from_vec? Which can help avoid copying when building a BytesMut from Vec<u8>

@al8n al8n changed the title Expose BytesMut::from_vec(vec) Export BytesMut::from_vec(vec) May 28, 2023
@rklaehn rklaehn mentioned this issue Apr 9, 2024
@Darksonn
Copy link
Contributor

That seems reasonable to me. We can add it as a From impl.

@braddunbar
Copy link
Contributor

The comments on from_vec call out the fact that the allocation strategy may change at some point, making this a more expensive operation because it would then involve a copy. I'm not sure how likely that is and I don't have any opinions about exposing it, but I wanted to call it out here since this would be a departure from that stance.

bytes/src/bytes_mut.rs

Lines 824 to 829 in 327615e

// For now, use a `Vec` to manage the memory for us, but we may want to
// change that in the future to some alternate allocator strategy.
//
// Thus, we don't expose an easy way to construct from a `Vec` since an
// internal change could make a simple pattern (`BytesMut::from(vec)`)
// suddenly a lot more expensive.

@Darksonn
Copy link
Contributor

Yes, good point, thanks.

@hackerbirds
Copy link

hackerbirds commented Jun 22, 2024

While it's true that the allocation behind Bytes could change in the future and suddenly make the conversion slower, as a regular user who just needs to work with libraries using Bytes, I don't think it's a good decision to not have a From<Vec<u8>> implementation.

Some libraries use Bytes, others use Vec<u8>, and right now the easiest way I found to convert from a Vec<u8> into Bytes is to deref it into &[u8], and then use From<&[u8]>, which calls to_vec() internally and makes a clone/copy anyway. Given that there is no real other choice (at least to my knowledge, feel free to tell me if there's a better way), I suspect this is what most users also end up doing.

I don't see how that's any better than letting us have a From<Vec<u8>> which would be faster at least today, even in the future where this operation ends up slow. Sometimes we just have no choice to convert back and forth between the types to use foreign libraries. If the sudden performance hit really a concern, then perhaps let from_vec public instead of making a From, and document its behavior publicly (and not privately).

EDIT: Meant BytesMut, sorry.

@demosdemon
Copy link

+1 for From<Vec<u8>> and maybe From<String>.

But I'm curious why From<Vec<u8>> is explicitly called out but the documentation for From<Bytes> says it avoids a copy if the Bytes is unique.

bytes/src/bytes.rs

Lines 1030 to 1050 in 2d996a2

impl From<Bytes> for BytesMut {
/// Convert self into `BytesMut`.
///
/// If `bytes` is unique for the entire original buffer, this will return a
/// `BytesMut` with the contents of `bytes` without copying.
/// If `bytes` is not unique for the entire original buffer, this will make
/// a copy of `bytes` subset of the original buffer in a new `BytesMut`.
///
/// # Examples
///
/// ```
/// use bytes::{Bytes, BytesMut};
///
/// let bytes = Bytes::from(b"hello".to_vec());
/// assert_eq!(BytesMut::from(bytes), BytesMut::from(&b"hello"[..]));
/// ```
fn from(bytes: Bytes) -> Self {
let bytes = ManuallyDrop::new(bytes);
unsafe { (bytes.vtable.to_mut)(&bytes.data, bytes.ptr, bytes.len) }
}
}

And, there is a From<Vec<u8>> (and From<String>) on the Bytes. The From<Vec<u8>> for Bytes impl makes no such promise, but the API exists and performs a zero-copy instantiation of Bytes thus the contract has been set.

bytes/src/bytes.rs

Lines 964 to 997 in 2d996a2

impl From<Vec<u8>> for Bytes {
fn from(vec: Vec<u8>) -> Bytes {
let mut vec = ManuallyDrop::new(vec);
let ptr = vec.as_mut_ptr();
let len = vec.len();
let cap = vec.capacity();
// Avoid an extra allocation if possible.
if len == cap {
let vec = ManuallyDrop::into_inner(vec);
return Bytes::from(vec.into_boxed_slice());
}
let shared = Box::new(Shared {
buf: ptr,
cap,
ref_cnt: AtomicUsize::new(1),
});
let shared = Box::into_raw(shared);
// The pointer should be aligned, so this assert should
// always succeed.
debug_assert!(
0 == (shared as usize & KIND_MASK),
"internal: Box<Shared> should have an aligned pointer",
);
Bytes {
ptr,
len,
data: AtomicPtr::new(shared as _),
vtable: &SHARED_VTABLE,
}
}
}

So, transitively, From<Vec<u8>> for BytesMut should be simply BytesMut::from(Bytes::from(vec)) and ultimately zero-copy.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants